Pub Date: 2026-01-13. DOI: 10.1016/j.eswa.2026.131136
Bing-Tao Wang , Quan-Ke Pan , Shengxiang Yang , Xue-Lei Jing , WeiMin Li
This research addresses a hybrid flowshop scheduling problem incorporating worker competency constraints. Unlike most existing studies, which assume workers can operate all machines, our work accounts for the absence of certain worker skills. The added constraints substantially increase the problem's complexity, rendering traditional algorithms inadequate for obtaining feasible solutions. Therefore, a mixed-integer programming model is formulated, and a variable representation cooperative co-evolutionary algorithm (VRCCEA) is designed to minimize makespan. Based on the decomposition idea, we use two populations to address the multi-coupled problem and implement a cooperative mechanism by introducing a solution archive that promotes the co-evolution of the populations. Given the limitations of a single encoding–decoding strategy, a variable representation mechanism is provided to balance exploration scale and search efficiency. To prevent worker-assignment failures, we design a heuristic based on a resource constraint matrix (RCM) that conducts a greedy search within the feasible region. Drawing on problem-specific knowledge, a reduced insertion neighborhood and an accelerated evaluation strategy are proposed to swiftly identify the best neighborhood solution. Finally, analytical experiments show the practical value of the algorithmic components and demonstrate that VRCCEA significantly outperforms five advanced metaheuristics.
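The RCM-guided greedy idea can be illustrated with a minimal sketch. The abstract does not specify the paper's actual heuristic; the matrix layout, the least-loaded tie-break, and all names below are invented for illustration:

```python
# Illustrative sketch (not the paper's exact RCM heuristic): greedily give
# each machine a qualified worker, preferring the least-loaded one, and
# declare infeasibility when the resource constraint matrix (RCM) admits
# no worker for some machine.

def greedy_assign(rcm, num_machines, num_workers):
    """rcm[w][m] is True when worker w can operate machine m."""
    load = [0] * num_workers
    assignment = {}
    for m in range(num_machines):
        qualified = [w for w in range(num_workers) if rcm[w][m]]
        if not qualified:
            return None  # no feasible assignment for this machine
        w = min(qualified, key=lambda q: load[q])  # least-loaded qualified worker
        assignment[m] = w
        load[w] += 1
    return assignment

# Example: 3 machines, 2 workers; worker 0 cannot operate machine 2.
rcm = [[True, True, False],
       [True, False, True]]
print(greedy_assign(rcm, 3, 2))  # {0: 0, 1: 0, 2: 1}
```

Searching only inside the feasible region, as here, guarantees every produced assignment respects the competency constraints by construction.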
Title: Scheduling a Constrained Hybrid Flowshop Using a Variable Representation Cooperative Co-Evolutionary Algorithm. Expert Systems with Applications, vol. 308, Article 131136.
Pub Date: 2026-01-12. DOI: 10.1016/j.eswa.2026.131185
Haoxiang Lei, Jianbo Su
Accurate trajectory prediction of unmanned aerial vehicles (UAVs) is crucial for effective anti-UAV defense. However, existing methods are typically developed under ideal conditions and fail to maintain robustness under diverse disturbances. To address this challenge, we propose a Teacher-Student framework for UAV state estimation and trajectory forecasting that enhances reliability across diverse disturbances. The framework integrates diffusion-based denoising and audio-visual feature fusion to extract robust motion states, while pseudo-state supervision is derived from kinematic modeling and CAD-guided pose estimation. Experimental results demonstrate that our method consistently outperforms state-of-the-art baselines across both ideal and disturbed scenarios, achieving accurate long-horizon predictions essential for real-world anti-UAV applications. Code will be released to support future research in robust anti-UAV systems at https://github.com/hxlei0827/Robust-Anti-UAV-Under-Diverse-Disturbances.
Title: Robust UAV trajectory prediction under diverse disturbances via teacher-student framework. Expert Systems with Applications, vol. 308, Article 131185.
Pub Date: 2026-01-12. DOI: 10.1016/j.eswa.2026.131183
Zongshi Liu , Guojian Zou , Ting Wang , Meiting Tu , Hongwei Wang , Ye Li
The coexistence of connected and automated vehicles (CAVs) and human-driven vehicles (HDVs) introduces complex non-linear dynamics, characterized by stop-and-go wave noise and velocity separation, making real-time safety risk assessment difficult. Current research on crash/conflict prediction in mixed CAV-HDV traffic remains limited; existing risk assessment models, which predominantly rely on linear Euclidean distances or instantaneous feature similarity, often misinterpret non-conflict fluctuations as crash precursors, resulting in unstable performance and high false alarm rates. To address this, we propose a Manifold Similarity Spatiotemporal Graph Network (MS-STGNet) tailored for robust real-time conflict prediction in mixed freeway traffic. Rather than distinguishing traffic states in a linear space, this model constructs a manifold-based traffic-state similarity graph to capture the intrinsic geometric structure of traffic evolution. It integrates physical adjacency with semantic neighbors and combines residual feature extraction, temporal convolution, and an adaptive fusion gate to learn spatiotemporal risk patterns. We evaluated the framework's performance under mixed traffic scenarios with varying penetration rates of CAVs and HDVs. The experimental results demonstrate that MS-STGNet achieves consistently strong and stable performance across varying market penetration levels and traffic scenarios. Compared to state-of-the-art baseline models, it delivers higher predictive accuracy and substantially lower false alarm rates. The methodologies and outcomes presented in this study can support real-time mixed traffic control on intelligent highways and crash prevention through real-time crash risk warnings at high-risk locations.
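A common way to approximate manifold structure, as opposed to treating states as points in a flat Euclidean space, is a k-nearest-neighbour graph in which only locally close states (where Euclidean distance is still meaningful) are connected. The paper's actual similarity construction may differ; this stdlib-only sketch with invented state vectors is for intuition only:

```python
import math

# Hedged sketch: build a k-NN graph over traffic-state feature vectors.
# Distant states are never linked directly, so global geometry emerges
# from chains of local neighbourhoods, approximating the data manifold.

def knn_graph(states, k):
    """Return adjacency sets of the k-NN graph over a list of feature vectors."""
    n = len(states)

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    adj = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(states[i], states[j]))
        adj.append(set(others[:k]))
    return adj

# Two well-separated clusters of traffic states: edges stay within clusters.
states = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(knn_graph(states, 1))  # [{1}, {0}, {3}, {2}]
```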
Title: Learning and predicting traffic conflicts in mixed traffic: A spatiotemporal graph neural network with manifold similarity learning. Expert Systems with Applications, vol. 309, Article 131183.
Pub Date: 2026-01-12. DOI: 10.1016/j.eswa.2026.131145
Zhuo Chen , Zhong Chen , Bofeng Long , Tongzhe Liu , Ming Yao
Currently, many color image encryption algorithms suffer from weak inter-channel interactions and lack sensitivity in their permutation and diffusion mechanisms. Therefore, this paper presents a novel 4D hyperchaotic system along with two types of 3-layer Peano curves (3LPC), based on which an efficient and secure cross-channel color image encryption scheme is proposed. Firstly, we incorporate an absolute value term and implement nonlinear coupling within the hyperchaotic system, which is proven to have complex dynamic behavior and a wide hyperchaotic interval through attractor diagrams, Lyapunov exponent spectra, bifurcation diagrams, Poincaré section diagrams, and NIST tests. Secondly, to overcome the challenges of cross-channel coverage and low utilization in current space-filling curves used for image encryption, we propose the 3LPC, which corresponds to the three-channel physical structure of color images. Based on different application scenarios, we design two distinct adaptation strategies to efficiently meet the encryption requirements of the various stages of the encryption process. Finally, we propose a cross-channel technique with cross-channel random permutation and bidirectional spatial diffusion. By optimizing the 3LPC, the RGB channels of the color image are interconnected, ensuring that the influence of pixel encryption extends beyond individual channels to encompass the entire image structure. Meanwhile, the chaotic matrix generated by the 4D hyperchaotic system guarantees randomness in the encryption process, significantly increasing key sensitivity and key space. Simulation results and security analyses demonstrate that the proposed encryption scheme exhibits excellent permutation and diffusion properties, effectively resisting a range of attacks.
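The cross-channel permutation idea can be sketched in a few lines. A 1D logistic map stands in for the paper's 4D hyperchaotic system, and a simple sort-based ranking stands in for the 3LPC traversal, neither of which is specified here; the point is only that pixels from all three channels are scrambled in one joint ordering, so a pixel may move between channels:

```python
# Hedged sketch of chaos-driven cross-channel permutation. The chaotic
# sequence ranks every position across the concatenated R, G, B channels;
# sorting by that rank yields an invertible permutation of the whole image.

def logistic_sequence(x0, r, n):
    """Logistic map x -> r*x*(1-x), a classic 1D chaotic generator."""
    seq, x = [], x0
    for _ in range(n):
        x = r * x * (1 - x)
        seq.append(x)
    return seq

def cross_channel_permute(rgb_flat, key=0.3141):
    """rgb_flat: all channel values concatenated into one list."""
    chaos = logistic_sequence(key, 3.99, len(rgb_flat))
    order = sorted(range(len(rgb_flat)), key=lambda i: chaos[i])
    return [rgb_flat[i] for i in order], order

def inverse_permute(permuted, order):
    out = [0] * len(permuted)
    for dst, src in enumerate(order):
        out[src] = permuted[dst]
    return out

pixels = [10, 20, 30, 40, 50, 60]   # e.g. R=[10,20], G=[30,40], B=[50,60]
scrambled, order = cross_channel_permute(pixels)
assert inverse_permute(scrambled, order) == pixels  # decryption recovers the image
```

Because the permutation order depends on every bit of the key through the chaotic iteration, a tiny key change produces a completely different scrambling, which is the key-sensitivity property the abstract refers to.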
Title: A cross-channel color image encryption scheme based on a novel 4D hyperchaotic system and 3-layer Peano curve. Expert Systems with Applications, vol. 309, Article 131145.
Maintaining healthy posture and attention is essential for effective learning, yet conventional monitoring systems are often costly and limited to single modalities. This study presents a multimodal expert system integrating self-powered triboelectric sensors and visual information for real-time classroom monitoring. A smart seat cushion generates an open-circuit voltage of 168 V under 20 N, enabling stable, energy-autonomous signal acquisition. The triboelectric branch employs a CNN–LSTM–STAN architecture to capture temporal patterns and salient features, achieving 96.4% accuracy in classifying seven sitting postures. The visual branch utilizes MobileNetV3 integrated with CBAM to efficiently extract facial features. Subsequently, features from both the triboelectric and visual branches are fused via concatenation for joint posture and attention assessment, achieving an attention recognition accuracy of 93.2%. These results demonstrate the effectiveness of combining self-powered sensing with deep learning-based multimodal analysis for robust, real-time classroom behavioral assessment.
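The concatenation fusion described above can be sketched minimally: branch features are joined into one vector and passed to a shared classifier head. The feature dimensions, weights, and class labels below are invented; the paper's classifier head is not specified in the abstract:

```python
import math

# Hedged sketch of late fusion by concatenation: triboelectric and visual
# branch features are joined, then scored by a linear layer with softmax.

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def fused_predict(tribo_feat, visual_feat, weights, bias):
    x = list(tribo_feat) + list(visual_feat)          # concatenation fusion
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

tribo, visual = [0.2, 0.7], [0.5]   # pooled per-branch features (invented)
W = [[1.0, 0.0, 0.5],               # hypothetical "attentive" class weights
     [0.0, 1.0, -0.5]]              # hypothetical "distracted" class weights
probs = fused_predict(tribo, visual, W, [0.0, 0.0])
assert abs(sum(probs) - 1.0) < 1e-9  # valid probability distribution
```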
Title: Multimodal self-powered sensing and deep learning for posture and attention assessment in classrooms. Junjie Tang, Manyun Zhang, Yutong Wang, Zhiyuan Zhu. Pub Date: 2026-01-12. DOI: 10.1016/j.eswa.2026.131176. Expert Systems with Applications, vol. 308, Article 131176.
Pub Date: 2026-01-12. DOI: 10.1016/j.eswa.2026.131200
Feihong Ma , Jia Ma , Meng Chen , Yuliang Li , JunLiang Wang
Most research on 3D object shape recognition is unimodal and lacks interpretability. To address the issues of missing point cloud data and the poor interpretability of visual perception when acquiring object shape information under occlusion, a novel multi-modal model based on the fusion of visual and tactile point cloud information for 3D shape recognition and interpretability is proposed. First, an experimental acquisition platform for visual and tactile point clouds is constructed, which facilitates the collection of visual and tactile point clouds of objects under self-occlusion conditions. Second, a shape recognition model based on the fusion of multiple attention mechanisms for visual and tactile point clouds is established, which is used to extract features from the preprocessed visual and tactile point clouds. The instance and class accuracies of the Dual-VT-Multi-attention model on the self-built dataset are 80.32% and 83.32%, respectively, significantly higher than those of the single visual or tactile modalities. Finally, to provide an intuitive interpretation of the classification decision process of the Dual-VT-Multi-attention model in each modality, an interpretable method based on the recognition model of visual and tactile point clouds is proposed. The contribution of each point can be calculated by a weighted summation of its feature vectors, which allows the generation of a Class Attention Response Map to visualize the points that are important for the model's classification decision. The Class Attention Response Map makes the shape recognition results of the Dual-VT-Multi-attention model in each modality more transparent and interpretable.
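The per-point contribution idea can be sketched directly from the description: each point's score is a class-weighted sum of its feature vector, normalised into a heat value for the response map. The feature values and class weights below are invented for illustration:

```python
# Hedged sketch of the point-contribution computation behind a Class
# Attention Response Map: weight each point's features by the class
# weight vector, sum, then min-max normalise the scores to [0, 1].

def point_contributions(point_features, class_weights):
    scores = [sum(w * f for w, f in zip(class_weights, feats))
              for feats in point_features]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0          # guard against a constant score field
    return [(s - lo) / span for s in scores]

feats = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]   # 3 points, 2-dim features
heat = point_contributions(feats, [1.0, 0.5])  # hypothetical class weights
assert heat[0] == 0.0 and heat[1] == 1.0       # least/most important points
```

Rendering `heat` as per-point colors over the cloud gives the visualization the abstract describes: points with high values drove the classification decision.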
Title: 3D shape recognition and interpretability model based on the fusion of real visual and tactile point clouds. Expert Systems with Applications, vol. 308, Article 131200.
Pub Date: 2026-01-11. DOI: 10.1016/j.eswa.2026.131186
Yixin Liu , Zhihao Zhang , Lingling Wang , Li Fu , Xiaohong Liu
Vestibular perception is essential for human spatial navigation, providing vital information about motion and orientation. However, existing vestibular research overlooks how the brain dynamically interprets self-motion. We present the Time Domain Compound Attention (TDCA) network, a model that decodes directional states from electroencephalography (EEG). TDCA employs multiscale temporal convolutions to capture both transient and sustained neural dynamics. A self-attention module highlights informative spatial-feature representations, while a temporal convolution module integrates them over time. Using a dataset of vestibular direction perception with synchronized EEG and inertial measurement unit (IMU) recordings from 20 participants performing five motion states (left, right, forward, backward, and stationary), TDCA achieved 93.97% accuracy under subject-independent tenfold cross-validation. Beyond its high decoding accuracy, TDCA’s temporal predictions exhibit strong alignment with an IMU-driven vestibular model, providing biophysically grounded, dual-path validation and supporting its physiological plausibility. These findings advance brain-inspired navigation research and demonstrate the feasibility of online brain-computer interfaces under natural vestibular stimulation.
Title: A time domain compound attention neural network for direction perception with vestibular model verification. Expert Systems with Applications, vol. 309, Article 131186.
Pub Date: 2026-01-11. DOI: 10.1016/j.eswa.2026.131164
Fei Hao , Xiaofeng Zhang , Yepeng Liu , Yujuan Sun , Hua Wang , Lin Yang , Ren Wang
Predicting future trends based on historical data is essential in real-world applications such as industrial energy planning and urban traffic planning. However, due to the inherent complexity of real-world data, it often exhibits non-stationarity, making it difficult for models to capture latent features and leading to a decline in forecasting performance. In this study, FTdasc is proposed to address this challenge. FTdasc combines frequency- and time-domain information for decomposition, effectively capturing long-range and short-range dependencies within the time series. Additionally, it integrates inter-channel with intra-channel information to provide a more comprehensive feature representation. More importantly, FTdasc introduces a stationarity correction method based on temporal dependencies, which restores non-stationary information by constraining the data distribution. Experimental results on ten benchmark datasets demonstrate that FTdasc is highly robust and effective for both long- and short-term time series forecasting. Code availability: https://github.com/hao-fei-hub/FTdasc.
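FTdasc's exact correction mechanism is not given in the abstract, but the remove-then-restore pattern it describes can be illustrated with the classical stand-in of first differencing: strip the trend so the model sees a (more) stationary series, then add the removed non-stationary information back after forecasting:

```python
# Hedged sketch (classical differencing, not FTdasc itself): a linear trend
# is removed by first differencing and exactly restored afterwards.

def difference(series):
    """First-order differences: y[t] = x[t+1] - x[t]."""
    return [b - a for a, b in zip(series, series[1:])]

def restore(first_value, diffs):
    """Invert differencing via cumulative summation from the anchor value."""
    out = [first_value]
    for d in diffs:
        out.append(out[-1] + d)
    return out

trend = [1.0, 2.5, 4.0, 5.5, 7.0]     # strongly non-stationary (linear trend)
d = difference(trend)                 # constant [1.5, 1.5, 1.5, 1.5]: stationary
assert restore(trend[0], d) == trend  # the transform is exactly invertible
```

A forecaster trained on `d` predicts level-free increments; `restore` re-injects the trend, mirroring the "restores non-stationary information" step described above.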
Title: FTdasc: A frequency-Time domain approach with stationarity correction for multivariate time series forecasting. Expert Systems with Applications, vol. 308, Article 131164.
Pub Date: 2026-01-11. DOI: 10.1016/j.eswa.2026.131170
Qian Shen , Lei Zhang , Yan Zhang , Yuxiang Zhang , Shihao Liu , Yi Li
In recent years, researchers have employed image classification and object detection methods to recognize distracted driving behaviors (DDB). Nevertheless, a comprehensive comparative analysis of these two methods within the realm of distracted driving behavior recognition (DDR) remains underexplored, resulting in most existing algorithms struggling to balance efficiency and accuracy. Therefore, based on a comparative analysis of these two methods, this paper proposes a novel DDR algorithm named DDR-YOLO, inspired by YOLO11. Initially, this paper explores which method performs better in DDR using 250,000 manually labeled images from the 100-Drivers dataset. Furthermore, the lightweight DDR-YOLO algorithm, which achieves high accuracy while improving efficiency, is introduced. To accurately capture both the local details and overall postural features of DDB, an innovative Neck structure called MHMS is designed along with a new feature extraction module referred to as SGHCB. To further optimize model efficiency, this paper presents an efficient spatial-reorganization upsampling (ESU) module and a novel Shared Convolution Detection head (SCDetection). ESU restructures feature information across channel and spatial dimensions through channel shuffle and spatial shift, significantly reducing both computational complexity and the loss of feature information. By introducing a dedicated detection head branch for very large targets and sharing convolutional parameters across all four branches, SCDetection achieves enhanced detection capability for oversized objects and greater computational efficiency. Additionally, an adaptive dynamic label assignment strategy is developed to enhance the discriminative ability of both high-confidence class predictions and precisely regressed bounding box coordinates, thereby improving recognition accuracy. Moreover, a novel channel pruning method termed DG-LAMP is proposed to significantly reduce the computational cost of the model.
Then knowledge distillation is implemented to compensate for the accuracy loss. Experimental results reveal that on the 100-Drivers dataset, most existing lightweight classification algorithms underperform, achieving classification accuracies of only 70% to 80%, and fail to classify multiple DDB occurring at the same time. DDR-YOLO achieves accuracies of 91.6% and 88.8% on the RGB and near-infrared modalities with a computational cost of 1.2 GFLOPs, a parameter count of 0.45M, and approximately 2000 FPS. In addition, generalization experiments conducted on the StateFarm dataset and our self-collected dataset achieve accuracies of 44.3% and 87.6%, respectively. Furthermore, the proposed algorithm is deployed on an NVIDIA Jetson Orin Nano 8GB platform for practical validation. In high-power mode, DDR-YOLO runs stably for extended periods with FPS remaining around 29, and the operating temperature stays within a normal range. These results confirm that the proposed algorithm shows outstanding performance in model size and real-time capability while maintaining high accuracy.
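The abstract does not detail how ESU's channel shuffle and spatial shift are composed. As an illustrative sketch only (not the paper's implementation), the snippet below shows two standard parameter-free rearrangements such a module can build on: group-wise channel shuffle, and depth-to-space reorganization, which upsamples by moving channel information into the spatial dimensions.

```python
import numpy as np

def channel_shuffle(x, groups):
    # x: (C, H, W). Interleave channels across groups so that information
    # from different groups is mixed without any learned parameters.
    c, h, w = x.shape
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)

def depth_to_space(x, r):
    # x: (C*r*r, H, W) -> (C, H*r, W*r). Spatial-reorganization upsampling:
    # each r*r block of channels becomes an r x r spatial neighborhood.
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# A 4-channel 2x2 map upsampled to a 1-channel 4x4 map, no FLOPs beyond copies.
feat = np.arange(16, dtype=np.float32).reshape(4, 2, 2)
up = depth_to_space(channel_shuffle(feat, groups=2), r=2)
```

Both operations are pure memory rearrangements, which is consistent with the abstract's claim that ESU reduces computational complexity while limiting feature-information loss.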
{"title":"DDR-YOLO: An efficient and accurate object detection algorithm for distracted driving behaviors","authors":"Qian Shen , Lei Zhang , Yan Zhang , Yuxiang Zhang , Shihao Liu , Yi Li","doi":"10.1016/j.eswa.2026.131170","DOIUrl":"10.1016/j.eswa.2026.131170","url":null,"abstract":"<div><div>In recent years, researchers have employed image classification and object detection methods to recognize distracted driving behaviors (DDB). Nevertheless, a comprehensive comparative analysis of these two methods within the realm of distracted driving behavior recognition (DDR) remains underexplored, resulting in most existing algorithms struggling to balance efficiency and accuracy. Therefore, based on a comparative analysis of these two methods, this paper proposes a novel DDR algorithm named DDR-YOLO inspired by YOLO11. Initially, this paper explores the method that performs better in DDR using 250,000 manually labeled images from the 100-Drivers dataset. Furthermore, the lightweight DDR-YOLO algorithm that achieves high accuracy while improving efficiency is introduced. To accurately capture both the local details and overall postural features of DDB, an innovative Neck structure called MHMS is designed along with a new feature extraction module referred to as SGHCB. To further optimize model efficiency, this paper presents an efficient spatial-reorganization upsampling (ESU) module and a novel Shared Convolution Detection head (SCDetection). ESU restructures feature information across channel and spatial dimensions through channel shuffle and spatial shift, with a significant reduction in computational complexity and loss of feature information. By introducing a dedicated detection head branch for huge targets and sharing convolutional parameters across all four branches, SCDetection achieves enhanced detection capability for oversized objects and greater computational efficiency. 
Additionally, an adaptive dynamic label assignment strategy is developed to enhance the discriminative ability of both high-confidence class predictions and precisely regressed bounding box coordinates, thereby improving recognition accuracy. Moreover, a novel channel pruning method termed DG-LAMP is proposed to significantly reduce the computational cost of the model. Then knowledge distillation is implemented to compensate for the accuracy loss. Experimental results reveal that on the 100-Drivers dataset, most existing lightweight classification algorithms underperform, achieving classification accuracies of only 70% to 80%, and fail to classify multiple DDB occurring at the same time. The DDR-YOLO achieves accuracies of 91.6% and 88.8% on RGB and near-infrared modalities with a computational cost of 1.2 GFLOPs, a parameter count of 0.45M and approximately 2000 FPS. In addition, generalization experiments conducted on the StateFarm dataset and our self-collected dataset achieve accuracies of 44.3% and 87.6%, respectively. Furthermore, the proposed algorithm is deployed on an NVIDIA Jetson Orin Nano 8GB platform for practical validation. In high-power mode, DDR-YOLO runs stably for extended periods with the FPS remaining at around 29, and the operating temperature stays within a normal range. These results confirm that the proposed algorithm shows outstanding performance in model size and real-time capability while maintaining high accuracy.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"308 ","pages":"Article 131170"},"PeriodicalIF":7.5,"publicationDate":"2026-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
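DG-LAMP is not described beyond its name in the abstract. For readers unfamiliar with the LAMP family it appears to build on, here is a minimal sketch of the original layer-adaptive magnitude pruning (LAMP) score, in which each weight's squared magnitude is normalized by the sum of squared magnitudes of all weights in the same layer that are at least as large; the function names and the global-threshold mask are illustrative assumptions, not the paper's DG-LAMP.

```python
import numpy as np

def lamp_scores(w):
    # LAMP score per weight: w_i^2 / sum of w_j^2 over all j in the same
    # layer with |w_j| >= |w_i|. Scores are comparable across layers.
    flat = w.ravel() ** 2
    order = np.argsort(flat)             # ascending squared magnitude
    sq = flat[order]
    tail = np.cumsum(sq[::-1])[::-1]     # tail sum, including the current entry
    scores = np.empty_like(flat)
    scores[order] = sq / tail
    return scores.reshape(w.shape)

def lamp_mask(w, sparsity):
    # Keep the (1 - sparsity) fraction of weights with the highest LAMP scores.
    scores = lamp_scores(w).ravel()
    k = int(sparsity * scores.size)
    if k == 0:
        return np.ones(w.shape, dtype=bool)
    thresh = np.partition(scores, k)[k]
    return lamp_scores(w) >= thresh
```

Within a single layer the score is monotone in weight magnitude, so pruning reduces to magnitude pruning per layer; the normalization only changes how the sparsity budget is shared across layers.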
Pub Date: 2026-01-11 DOI: 10.1016/j.eswa.2026.131171
Zhen Wang , Yuqing Gao , Yuzhen Chen , Zheng Lu
Active and intelligent control of structural seismic response has become a key approach for enhancing the safety and resilience of civil infrastructure. However, traditional model-based active controllers typically rely on accurate structural models and fixed control laws, which may perform poorly under complex real-world conditions. To address these limitations, this study proposes SeisMind, an end-to-end seismic control framework in which control policies are optimized using deep reinforcement learning. To handle unpredictable seismic excitations, a stochastic training environment is established by systematically sampling a wide range of near-field and far-field earthquake records, capturing diverse frequency contents, intensity levels, and seismic characteristics. To improve the robustness of the control strategy, structural uncertainties, particularly stiffness degradation, are incorporated via domain randomization during training. In addition, a physically interpretable reward function is designed to integrate structural response indicators capturing both structural and non-structural damage, as well as actuator effort. Numerical experiments demonstrate that SeisMind achieves effective control performance across linear and nonlinear structural systems, maintaining stable performance even under structural degradation. Across multiple seismic intensity levels, SeisMind exhibits more stable, self-adaptive, and less variable control performance across a range of ground motion records, outperforming conventional Linear Quadratic Regulator and H∞ robust controllers, thereby highlighting its potential as a scalable and generalizable solution for next-generation intelligent seismic control of civil infrastructure.
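The abstract describes the reward as integrating structural response indicators (for both structural and non-structural damage) with actuator effort, and says stiffness degradation is randomized during training. A minimal sketch of both ideas follows; the penalty weights, response proxies, and degradation range are illustrative assumptions, not the paper's values.

```python
import numpy as np

def seismic_reward(drift, accel, force, w_drift=1.0, w_accel=0.5, w_force=1e-3):
    # Per-step reward: penalize inter-story drift (structural damage proxy),
    # floor acceleration (non-structural damage proxy), and actuator effort.
    return -(w_drift * float(np.sum(np.square(drift)))
             + w_accel * float(np.sum(np.square(accel)))
             + w_force * float(np.sum(np.square(force))))

def randomized_stiffness(k_nominal, rng, degradation=0.3):
    # Domain randomization: scale each story stiffness by a random factor in
    # [1 - degradation, 1] at the start of an episode, so the learned policy
    # must remain effective under stiffness degradation.
    return k_nominal * rng.uniform(1.0 - degradation, 1.0, size=np.shape(k_nominal))
```

At the start of each training episode one would draw a fresh stiffness vector (e.g. `randomized_stiffness(k0, np.random.default_rng())`) before simulating the sampled ground-motion record, and accumulate `seismic_reward` over the episode.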
{"title":"SeisMind: A domain-knowledge-informed reinforcement learning framework for intelligent control of structural seismic response","authors":"Zhen Wang , Yuqing Gao , Yuzhen Chen , Zheng Lu","doi":"10.1016/j.eswa.2026.131171","DOIUrl":"10.1016/j.eswa.2026.131171","url":null,"abstract":"<div><div>Active and intelligent control of structural seismic response has become a key approach for enhancing the safety and resilience of civil infrastructure. However, traditional model-based active controllers typically rely on accurate structural models and fixed control laws, which may perform poorly under complex real-world conditions. To address these limitations, this study proposes SeisMind, an end-to-end seismic control framework in which control policies are optimized using deep reinforcement learning. To handle unpredictable seismic excitations, a stochastic training environment is established by systematically sampling a wide range of near-field and far-field earthquake records, capturing diverse frequency contents, intensity levels, and seismic characteristics. To improve the robustness of the control strategy, structural uncertainties, particularly stiffness degradation, are incorporated via domain randomization during training. In addition, a physically interpretable reward function is designed to integrate structural response indicators capturing both structural and non-structural damage, as well as actuator effort. Numerical experiments demonstrate that SeisMind achieves effective control performance across linear and nonlinear structural systems, maintaining stable performance even under structural degradation. 
Across multiple seismic intensity levels, SeisMind exhibits more stable, self-adaptive, and less variable control performance across a range of ground motion records, outperforming conventional Linear Quadratic Regulator and <em>H</em><sub>∞</sub> robust controllers, thereby highlighting its potential as a scalable and generalizable solution for next-generation intelligent seismic control of civil infrastructure.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"309 ","pages":"Article 131171"},"PeriodicalIF":7.5,"publicationDate":"2026-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145981726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}