
Latest Articles in Engineering Applications of Artificial Intelligence

An innovative user-to-device authentication scheme using broad learning-based dynamic hint generation
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-16, DOI: 10.1016/j.engappai.2025.113664
Milad Taleby Ahvanooey, Wojciech Mazurczyk
The Personal Identification Number (PIN) authentication scheme is still a broadly deployed standard security protocol for smart devices due to its simplicity and usability in Internet of Everything (IoE) environments. However, classical PIN schemes are technically susceptible to side-channel attacks, where adversaries can capture the victims' PINs through camera-based recording, keystroke logging, or visual observation of User-to-Device (U2D) interactions. To overcome this critical security flaw, we introduce an innovative Dynamical PIN Hiding (HDynPIN) multifactor authentication scheme for protecting IoE machines, which functions by concealing a Hidden-PIN (HP) under the guise of a Dynamic-Passcode (DPC) based on a Recurrent Neural Network (RNN)-generated hint item and a randomized entry pathway. HDynPIN requires the user to choose a 4- or 6-digit HP, a set of hint items, and their corresponding operators during the registration phase. Then, during the authentication phase, it displays a random hint item generated by a broad learning-based RNN algorithm, taking the user's settings into account, which guides the user through a randomized entry pathway by means of a one-time valid DPC. By concealing the HP and randomizing the DPC entry pathway, HDynPIN provides a user-friendly and more secure U2D protocol that is robust against side-channel attacks. Our extensive experimental evaluation confirms that HDynPIN provides better performance compared to state-of-the-art schemes.
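The core idea described above is that the hidden PIN is never typed directly; the user instead enters a one-time passcode derived from the PIN, a per-session hint, and a pre-registered operator. The toy Python sketch below illustrates only that general pattern; the hint digit, the operator table, and the digit-wise modular combination are illustrative assumptions, not the paper's RNN-generated hints or its randomized entry pathway.

```python
import random

# Hypothetical illustration only: a hidden PIN (HP) plus a per-session hint
# and operator yield a one-time dynamic passcode (DPC), so the HP itself is
# never typed. The real HDynPIN protocol uses RNN-generated hint items and a
# randomized entry pathway, which are not modelled here.
OPERATORS = {
    "add": lambda d, h: (d + h) % 10,   # shift each PIN digit up by the hint digit
    "sub": lambda d, h: (d - h) % 10,   # shift each PIN digit down by the hint digit
}

def issue_challenge():
    """Device side: pick a random hint digit and operator for this session."""
    return random.randint(0, 9), random.choice(list(OPERATORS))

def expected_dpc(hidden_pin: str, hint: int, op_name: str) -> str:
    """Both sides derive the one-time DPC from the HP and the session hint."""
    op = OPERATORS[op_name]
    return "".join(str(op(int(d), hint)) for d in hidden_pin)

if __name__ == "__main__":
    hp = "4821"                          # user's registered hidden PIN
    hint, op_name = issue_challenge()    # shown to the user at login
    dpc = expected_dpc(hp, hint, op_name)
    print(f"hint={hint}, operator={op_name}, one-time DPC={dpc}")
    # An observer who records the typed DPC learns nothing reusable,
    # because the hint (and hence the DPC) changes every session.
```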
{"title":"An innovative user-to-device authentication scheme using broad learning-based dynamic hint generation","authors":"Milad Taleby Ahvanooey ,&nbsp;Wojciech Mazurczyk","doi":"10.1016/j.engappai.2025.113664","DOIUrl":"10.1016/j.engappai.2025.113664","url":null,"abstract":"<div><div>The Personal Identification Number (PIN) authentication scheme is still a broadly deployed standard security protocol for smart devices due to its simplicity and usability in Internet of Everything (IoE) environments. However, the classical PIN schemes are technically susceptible to side-channel attacks, where adversaries can capture the victims’ PINs through camera-based recording, keystroke logging, or visual watching of User-to-Device (U2D) interactions. To overcome this critical security flaw, we introduce an innovative Dynamical PIN Hiding (HDynPIN) multifactor authentication scheme for protecting IoE machines, which functions by concealing a Hidden-PIN (HP) under the guise of a Dynamic-Passcode (DPC) based on a Recurrent Neural Network (RNN)-generated hint item and a randomized entry pathway. HDynPIN requires the user to choose a 4- or 6-digit HP, a set of hint items, and their corresponding operators during the registration phase. Then, it displays a random hint item generated using a broad learning-based RNN algorithm, considering the user’s settings, which guides her/him through a randomized entry pathway by utilizing a one-time valid DPC during the authentication phase. By concealing the HP and randomizing the DPC entry pathway, HDynPIN provides a user-friendly and more secure U2D protocol that is robust against side-channel attacks. Our extensive experimental evaluation confirms that HDynPIN provides better performance compared to state-of-the-art schemes.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113664"},"PeriodicalIF":8.0,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Imbalanced fault diagnosis of electromechanical systems under unseen operating conditions: a heterogeneous domain generalization framework combining digital twin knowledge and data
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-16, DOI: 10.1016/j.engappai.2026.113822
Xuanyuan Su, Kaixin Jin, Yongzhe Ma, Chen Lu, Laifa Tao
Imbalanced data and diverse operating conditions (OCs) are two common issues in fault diagnosis, which are generally addressed by data-driven artificial intelligence (AI) variants focusing on data generation and transfer learning. However, applying these approaches to complex electromechanical systems (EMS) remains challenging, as extended faults and diverse OCs create harsh data situations, such as scarcity of fault data and unseen OCs, thus limiting the efficacy of purely data-driven paradigms. This paper proposes a data and knowledge-combined intelligent fault diagnosis framework. Firstly, a collaborative hierarchical modeling mechanism is proposed to construct a full-system digital twin (DT) for EMS, which generates two modalities of information: DT fault data and DT knowledge, enriching both the scale and type of the available dataset. Furthermore, a heterogeneous domain generalization network (HDGN) is proposed to achieve generalized fault diagnosis from both data and knowledge perspectives. By embedding prior DT knowledge, domain-invariance is stably retained from the data. Driven by the triplet specific similarity loss, domain-specific discriminative representations are adaptively learned by multi-channels from the knowledge-embedded data. The resulting HDGN progressively improves model generalization to unseen OCs with well-balanced stability and adaptiveness. The experimental results demonstrate the proposed method's effectiveness and superiority, providing a reference for AI applications in industrial scenarios with imbalanced data and unseen OCs.
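As background for the triplet-style objective mentioned above, the snippet below shows the standard triplet margin loss available in PyTorch, which pulls same-class embeddings together and pushes different-class ones apart. The paper's triplet specific similarity loss is its own formulation and is not reproduced here; the embedding dimension and batch size are arbitrary.

```python
import torch
import torch.nn as nn

# Reference only: the standard triplet margin loss commonly used to learn
# discriminative embeddings. The HDGN's "triplet specific similarity loss"
# differs and is not implemented here.
triplet = nn.TripletMarginLoss(margin=1.0)
anchor = torch.randn(16, 128)     # embeddings of anchor samples
positive = torch.randn(16, 128)   # same fault class as the anchors
negative = torch.randn(16, 128)   # different fault class
print(float(triplet(anchor, positive, negative)))
```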
{"title":"Imbalanced fault diagnosis of electromechanical systems under unseen operating conditions: a heterogeneous domain generalization framework combining digital twin knowledge and data","authors":"Xuanyuan Su ,&nbsp;Kaixin Jin ,&nbsp;Yongzhe Ma ,&nbsp;Chen Lu ,&nbsp;Laifa Tao","doi":"10.1016/j.engappai.2026.113822","DOIUrl":"10.1016/j.engappai.2026.113822","url":null,"abstract":"<div><div>Imbalanced data and diverse operating conditions (OCs) are two common issues in fault diagnosis, which are generally addressed by data-driven artificial intelligence (AI) variants focusing on data generation and transfer learning. However, applying these approaches to complex electromechanical systems (EMS) remains challenging, as extended faults and diverse OCs create harsh data situations, such as scarcity of fault data and unseen OCs, thus limiting the efficacy of purely data-driven paradigms. This paper proposes a data and knowledge-combined intelligent fault diagnosis framework. Firstly, a collaborative hierarchical modeling mechanism is proposed to construct a full-system digital twin (DT) for EMS, which generates two modalities of information: DT fault data and DT knowledge, enriching both the scale and type of the available dataset. Furthermore, a heterogeneous domain generalization network (HDGN) is proposed to achieve generalized fault diagnosis from both data and knowledge perspectives. By embedding prior DT knowledge, domain-invariance is stably retained from the data. Driven by the triplet specific similarity loss, domain-specific discriminative representations are adaptively learned by multi-channels from the knowledge-embedded data. The resulting HDGN progressively improves model generalization to unseen OCs with well-balanced stability and adaptiveness. The experimental results demonstrate the proposed method's effectiveness and superiority, providing a reference for AI applications in industrial scenarios with imbalanced data and unseen OCs.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113822"},"PeriodicalIF":8.0,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A novel causal relationship-based evidential reasoning prediction method with a time parameter
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-16, DOI: 10.1016/j.engappai.2025.113669
Shanshan Liu, Liang Chang, Guanyu Hu, Guanghai Li
Predictive methods for complex engineering systems are critical for decision-making, but most rely on data fitting and fail to capture causal relationships amid ignorance and uncertainty. Traditional evidential reasoning (ER) addresses such system characteristics yet lacks causal mining and predictive capabilities. This paper proposes a novel temporal sequential ER prediction model that retains ER's strengths in handling uncertainty while mining and quantifying causality between evidence and conclusions via the improved Peter–Clark (PC) algorithm and transfer entropy. Incorporating a time parameter t enables predicting subsequent system states, enhancing accuracy and reliability. Validated across three industrial system domains, the model achieves remarkable performance: 93.78 % accuracy in short-term power load prediction (8.3 %–16.87 % improvement over baselines), 96.5 % in software defined networking (SDN) security prediction (10 %–14.96 % enhancement), and 93.65 % in flywheel system fault prediction (19.65 %–29.65 % improvement). These results confirm its practical value in boosting grid efficiency, strengthening network security, and improving equipment reliability for complex engineering systems.
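As context for the causality-quantification step, the sketch below estimates transfer entropy TE(X→Y) for discretised lag-1 series using a plain histogram estimator. It is a generic illustration under those assumptions, not the paper's improved Peter–Clark pipeline or its ER model.

```python
import numpy as np
from collections import Counter

# Minimal sketch: histogram estimate of transfer entropy from x to y for
# lag-1 symbolic series, the quantity used to weight causal links between
# evidence and conclusions. Not the paper's implementation.
def transfer_entropy(x, y):
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))      # (y_next, y_prev, x_prev)
    pairs_yx = Counter(zip(y[:-1], x[:-1]))            # (y_prev, x_prev)
    pairs_yy = Counter(zip(y[1:], y[:-1]))             # (y_next, y_prev)
    singles_y = Counter(y[:-1])                        # y_prev
    n = len(x) - 1
    te = 0.0
    for (yn, yp, xp), c in triples.items():
        p_joint = c / n
        p_cond_full = c / pairs_yx[(yp, xp)]               # p(y_next | y_prev, x_prev)
        p_cond_self = pairs_yy[(yn, yp)] / singles_y[yp]   # p(y_next | y_prev)
        te += p_joint * np.log2(p_cond_full / p_cond_self)
    return te

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, 1000)
    y = np.roll(x, 1)                                  # y copies x with a one-step delay
    y[0] = 0
    print("TE x->y:", transfer_entropy(x.tolist(), y.tolist()))  # close to 1 bit
    print("TE y->x:", transfer_entropy(y.tolist(), x.tolist()))  # near zero
```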
{"title":"A novel causal relationship-based evidential reasoning prediction method with a time parameter","authors":"Shanshan Liu ,&nbsp;Liang Chang ,&nbsp;Guanyu Hu ,&nbsp;Guanghai Li","doi":"10.1016/j.engappai.2025.113669","DOIUrl":"10.1016/j.engappai.2025.113669","url":null,"abstract":"<div><div>Predictive methods for complex engineering systems are critical for decision-making, but most rely on data fitting and fail to capture causal relationships amid ignorance and uncertainty. Traditional evidential reasoning (ER) addresses such system characteristics yet lacks causal mining and predictive capabilities. This paper proposes a novel temporal sequential ER prediction model that retains ER's strengths in handling uncertainty while mining and quantifying causality between evidence and conclusions via the improved Peter–Clark (PC) algorithm and transfer entropy. Incorporating a time parameter <em>t</em> enables predicting subsequent system states, enhancing accuracy and reliability. Validated across three industrial system domains, the model achieves remarkable performance: 93.78 % accuracy in short-term power load prediction (8.3 %–16.87 % improvement over baselines), 96.5 % in software defined networking (SDN) security prediction (10 %–14.96 % enhancement), and 93.65 % in flywheel system fault prediction (19.65 %–29.65 % improvement). These results confirm its practical value in boosting grid efficiency, strengthening network security, and improving equipment reliability for complex engineering systems.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113669"},"PeriodicalIF":8.0,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Artificial Intelligence-based back-calculation model for scrap compiling optimization
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-15, DOI: 10.1016/j.engappai.2026.113809
Michael Schäfer, Ulrike Faltings, Björn Glaser
Scrap is the most important secondary raw material in the transformation to low carbon dioxide (CO2) steel. However, the suitable use of different scrap types to produce high-quality steels with the right chemical composition is non-trivial: it requires process control and detailed knowledge of all input materials used. SHapley Additive exPlanations (SHAP), a game-theoretic approach, is often used to interpret machine learning models through visualizations and feature attributions. In this paper, we present a novel application of SHAP values that enables more precise control of material composition in steel production without the need for additional sensors, making it highly practical for real steel production environments.
As a basis for this approach, various machine learning models were trained and the respective SHAP values computed. To validate the approach, the results were compared with the values from the steel plant. Comparing the calculated values with the historical estimates, the results agree for most input materials and target elements. The key innovation lies in using SHAP values not only for model interpretability, but also as a quantitative tool to estimate the chemical content of input materials (e.g., steel scrap) based on process data. The framework enables chemical composition estimation, relying solely on routinely collected process data. This is a novel application of SHAP and allows the back-calculation of predicted values and can be used in a wide range of applications in industry and academia.
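A minimal sketch of the general pattern described above, on synthetic data: a gradient-boosted model maps scrap-charge features to a melt property, and SHAP values apportion each prediction across the input materials. The feature names, the hidden copper contents, and the model hyperparameters are illustrative assumptions, not the plant data or the authors' tuned pipeline.

```python
import numpy as np
import xgboost as xgb
import shap

# Illustrative back-calculation: attribute a predicted melt property to the
# individual scrap inputs via SHAP values (synthetic data, assumed features).
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(500, 3))                  # tonnes of 3 scrap types per heat
true_cu = np.array([0.05, 0.25, 0.40])                # hidden Cu content per scrap type
y = X @ true_cu + rng.normal(0, 0.01, 500)            # Cu mass in the melt

model = xgb.XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)                # (500, 3) per-material contributions

heat = 0
contributions = shap_values[heat]
print("prediction:", model.predict(X[heat:heat + 1])[0])
print("base + sum(SHAP):", explainer.expected_value + contributions.sum())
print("per-scrap contribution:", dict(zip(["scrap_A", "scrap_B", "scrap_C"],
                                          np.round(contributions, 4))))
```

The additivity property (base value plus the sum of the SHAP contributions equals the model output) is what turns the per-material attribution from a purely visual explanation into a quantitative estimate.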
{"title":"Artificial Intelligence-based back-calculation model for scrap compiling optimization","authors":"Michael Schäfer ,&nbsp;Ulrike Faltings ,&nbsp;Björn Glaser","doi":"10.1016/j.engappai.2026.113809","DOIUrl":"10.1016/j.engappai.2026.113809","url":null,"abstract":"<div><div>Scrap is the most important secondary raw material in the transformation to low carbon dioxide (<span><math><msub><mrow><mi>CO</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>) steel. However, the suitable use of different scrap types for producing high quality steels with the right chemical composition is non-trivial. It requires process control and detailed knowledge of all input materials used. SHapley Additive exPlanations (SHAP), a game-theoretic approach, is often used to interpret machine learning models through visualizations and feature attributions. In this paper, we present a novel application of SHAP values. This enables more precise control of material composition in steel production without the need for additional sensors. This makes it extremely practical for real steel production environments and enables better control of the materials used in the steel production process.</div><div>As a basis for this approach, various machine learning models were trained and the respective SHAP values computed. To validate the approach, the results were compared with the values from the steel plant. Comparing the calculated values with the historical estimates, the results agree for most input materials and target elements. The key innovation lies in using SHAP values not only for model interpretability, but also as a quantitative tool to estimate the chemical content of input materials (e.g., steel scrap) based on process data. The framework enables chemical composition estimation, relying solely on routinely collected process data. This is a novel application of SHAP and allows the back-calculation of predicted values and can be used in a wide range of applications in industry and academia.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113809"},"PeriodicalIF":8.0,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-view feature learning and enhanced hypergraph neural networks for synergistic prediction of drug combination
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-15, DOI: 10.1016/j.engappai.2026.113863
Wei Wang, Mengyi Ma, Hongjun Zhang, Yun Zhou, Guangsheng Wu
Drug combination therapy demonstrates more significant efficacy than monotherapy in cancer treatment. Despite the proposal of several computational approaches aimed at effectively identifying synergistic drug combinations, challenges persist due to inadequate multi-level learning within multimodal data. Furthermore, existing models still struggle to adequately capture the complex biological network interactions between drug combinations and cell lines. To overcome these issues, we propose a novel hypergraph neural network method for synergistic drug combination prediction. This method integrates multi-view feature learning and enhanced hypergraph neural networks to improve drug combination prediction. First, multi-view learning is independently applied to the multimodal data of drugs and cell lines. This framework employs a fine-tuned ChemBERTa model enhanced by contrastive learning to effectively capture the contextual information of drug SMILES. Second, enhanced hypergraph neural networks equipped with a multi-head attention mechanism are designed to capture the complex topological information between drugs and cell lines and to address the limited ability of the hypergraph to capture global information. Third, the similarity-based multi-task supervision module further stabilizes the model. The experimental results show that our method outperforms state-of-the-art methods in various scenarios, including leave-drug-combination-out, leave-cell-out, and leave-drug-out scenarios. Specifically, in the leave-drug combination-out scenario, our method achieves a Mean Squared Error of 163.635, a Root Mean Squared Error of 12.792, and a Pearson Correlation Coefficient of 0.751. Finally, a case study demonstrates the efficacy of the model in predicting novel synergistic drug combinations.
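As background for the hypergraph component, the snippet below implements one standard hypergraph-convolution layer, X' = ReLU(Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X Theta), on a toy incidence matrix. The paper's enhanced network adds multi-head attention, ChemBERTa-derived features, and multi-task supervision, none of which are shown; the toy graph and dimensions are assumptions.

```python
import numpy as np

# Minimal sketch of a standard hypergraph-convolution layer (not the paper's
# enhanced variant). H is the node-by-hyperedge incidence matrix.
def hypergraph_conv(X, H, Theta, edge_w=None):
    """X: (n_nodes, f_in), H: (n_nodes, n_edges), Theta: (f_in, f_out)."""
    n_nodes, n_edges = H.shape
    w = edge_w if edge_w is not None else np.ones(n_edges)
    W = np.diag(w)
    Dv = np.diag(1.0 / np.sqrt(H @ w))           # weighted node degrees
    De = np.diag(1.0 / (H.T @ np.ones(n_nodes))) # hyperedge degrees
    A = Dv @ H @ W @ De @ H.T @ Dv               # normalised hypergraph adjacency
    return np.maximum(A @ X @ Theta, 0.0)        # ReLU activation

if __name__ == "__main__":
    # Toy setup: 4 nodes (2 drugs + 2 cell lines), 2 hyperedges each joining
    # a drug pair with a cell line.
    H = np.array([[1, 0],
                  [1, 1],
                  [1, 0],
                  [0, 1]], dtype=float)
    X = np.random.default_rng(0).normal(size=(4, 8))
    Theta = np.random.default_rng(1).normal(size=(8, 4))
    print(hypergraph_conv(X, H, Theta).shape)    # (4, 4)
```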
{"title":"Multi-view feature learning and enhanced hypergraph neural networks for synergistic prediction of drug combination","authors":"Wei Wang ,&nbsp;Mengyi Ma ,&nbsp;Hongjun Zhang ,&nbsp;Yun Zhou ,&nbsp;Guangsheng Wu","doi":"10.1016/j.engappai.2026.113863","DOIUrl":"10.1016/j.engappai.2026.113863","url":null,"abstract":"<div><div>Drug combination therapy demonstrates more significant efficacy than monotherapy in cancer treatment. Despite the proposal of several computational approaches aimed at effectively identifying synergistic drug combinations, challenges persist due to inadequate multi-level learning within multimodal data. Furthermore, existing models still struggle to adequately capture the complex biological network interactions between drug combinations and cell lines. To overcome these issues, we propose a novel hypergraph neural network method for synergistic drug combination prediction. This method integrates multi-view feature learning and enhanced hypergraph neural networks to improve drug combination prediction. First, multi-view learning is independently applied to the multimodal data of drugs and cell lines. This framework employs a fine-tuned ChemBERTa model enhanced by contrastive learning to effectively capture the contextual information of drug SMILES. Second, enhanced hypergraph neural networks equipped with a multi-head attention mechanism are designed to capture the complex topological information between drugs and cell lines and to address the limited ability of the hypergraph to capture global information. Third, the similarity-based multi-task supervision module further stabilizes the model. The experimental results show that our method outperforms state-of-the-art methods in various scenarios, including leave-drug-combination-out, leave-cell-out, and leave-drug-out scenarios. Specifically, in the leave-drug combination-out scenario, our method achieves a Mean Squared Error of 163.635, a Root Mean Squared Error of 12.792, and a Pearson Correlation Coefficient of 0.751. Finally, a case study demonstrates the efficacy of the model in predicting novel synergistic drug combinations.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113863"},"PeriodicalIF":8.0,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Development of a physics-guided bidirectional long short-term memory for wind power forecasting
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-15, DOI: 10.1016/j.engappai.2026.113798
Kai Sun, Dongzhe Yang, Dasong Wang, Fangfang Zhang
Accurate forecasting of wind power is essential to prevent grid overload and minimize power wastage, thereby optimizing dispatch and reducing the operational costs of power systems. However, the intermittent and unpredictable nature of wind energy poses significant challenges in achieving timely and precise predictions. To address these challenges, this study proposes a hybrid wind power forecasting model integrating physical knowledge with a bidirectional long short-term memory (BiLSTM) network. First, data collected from a practical wind farm are preprocessed and resampled to mitigate the impact of measurement outliers stemming from sensor faults and turbulence. Second, mechanistic model identification for the studied wind turbine is conducted to encode the relevant physical knowledge into the BiLSTM model. Third, a wind-speed-based probabilistic penalty term is designed to address physically implausible predictions under low-wind-speed conditions. Moreover, an improved leaky rectified linear unit activation function is proposed to refine the BiLSTM model, preventing both negative power predictions and those exceeding the rated capacity. Finally, the developed model is applied to real-world wind turbines. Experimental results demonstrate that the proposed model can effectively eliminate physically implausible predictions, and exhibit superior robustness and enhanced prediction accuracy compared with other advanced algorithms.
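A minimal PyTorch sketch of the ingredients named above: a bidirectional LSTM forecaster, a bounded leaky activation that damps negative or above-rated outputs, and a wind-speed-based penalty added to the loss. The rated power, cut-in speed, network sizes, and the exact activation and penalty forms are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn as nn

# Assumed turbine constants for the sketch (kW, m/s).
P_RATED, V_CUT_IN = 2000.0, 3.0

class BoundedLeakyReLU(nn.Module):
    """Illustrative bounded leaky activation: damps negative and above-rated outputs."""
    def __init__(self, p_rated, alpha=0.01):
        super().__init__()
        self.p_rated, self.alpha = p_rated, alpha
    def forward(self, x):
        x = torch.where(x < 0, self.alpha * x, x)                               # damp negatives
        return torch.where(x > self.p_rated,
                           self.p_rated + self.alpha * (x - self.p_rated), x)   # damp overshoot

class PhysicsGuidedBiLSTM(nn.Module):
    def __init__(self, n_features=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.head = nn.Sequential(nn.Linear(2 * hidden, 1), BoundedLeakyReLU(P_RATED))
    def forward(self, x):                       # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).squeeze(-1)

def physics_loss(pred, target, wind_speed, lam=0.1):
    mse = nn.functional.mse_loss(pred, target)
    low_wind = (wind_speed < V_CUT_IN).float()          # below cut-in, power should be ~0
    penalty = (torch.relu(pred) * low_wind).mean()      # penalise implausible output
    return mse + lam * penalty

if __name__ == "__main__":
    model = PhysicsGuidedBiLSTM()
    x = torch.randn(8, 24, 4)                  # 8 samples, 24 time steps, 4 features
    wind = torch.rand(8) * 10
    loss = physics_loss(model(x), torch.rand(8) * P_RATED, wind)
    loss.backward()
    print(float(loss))
```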
{"title":"Development of a physics-guided bidirectional long short-term memory for wind power forecasting","authors":"Kai Sun,&nbsp;Dongzhe Yang,&nbsp;Dasong Wang,&nbsp;Fangfang Zhang","doi":"10.1016/j.engappai.2026.113798","DOIUrl":"10.1016/j.engappai.2026.113798","url":null,"abstract":"<div><div>Accurate forecasting of wind power is essential to prevent grid overload and minimize power wastage, thereby optimizing dispatch and reducing the operational costs of power systems. However, the intermittent and unpredictable nature of wind energy poses significant challenges in achieving timely and precise predictions. To address these challenges, this study proposes a hybrid wind power forecasting model integrating physical knowledge with a bidirectional long short-term memory (BiLSTM) network. First, data collected from a practical wind farm are preprocessed and resampled to mitigate the impact of measurement outliers stemming from sensor faults and turbulence. Second, mechanistic model identification for the studied wind turbine is conducted to encode the relevant physical knowledge into the BiLSTM model. Third, a wind-speed-based probabilistic penalty term is designed to address physically implausible predictions under low-wind-speed conditions. Moreover, an improved leaky rectified linear unit activation function is proposed to refine the BiLSTM model, preventing both negative power predictions and those exceeding the rated capacity. Finally, the developed model is applied to real-world wind turbines. Experimental results demonstrate that the proposed model can effectively eliminate physically implausible predictions, and exhibit superior robustness and enhanced prediction accuracy compared with other advanced algorithms.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113798"},"PeriodicalIF":8.0,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dinspector: Dual factor graph attention mechanism for Advanced Persistent Threat detection
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-15, DOI: 10.1016/j.engappai.2026.113861
Hongchao Wang, Wen Chen, Linrui Li, Haoyang Pu, Yilin Zhang
Advanced Persistent Threats (APTs) pose a serious threat to global cybersecurity. In recent years, a promising approach for APT detection has been proposed based on Graph Neural Networks (GNN) and provenance graphs constructed from host logs, in which graph nodes and edges represent processes and interactions, respectively. However, current methods primarily focus on the individual behaviors of single nodes (attacking subjects), neglecting in-depth analysis of the interactions between collaborative attacking processes. This results in limited capability in detecting complex and composite APT attacks. In this paper, a new GNN-based APT detection method, Dinspector, is proposed. It utilizes a dual factor attention mechanism to aggregate the features of neighboring nodes and edges simultaneously. Furthermore, Dinspector combines GraphSAGE (Sample and Aggregate) and a graph attention layer into a two-layer Graph Neural Network structure. By integrating node features, structural features, and neighbor features, Dinspector is capable of extracting features of complex attack patterns, improving the detection performance on novel APT attacks. Experimental results on three public datasets demonstrate that Dinspector achieves an average precision of 98% and a recall rate of 99%, attaining state-of-the-art detection performance and outperforming existing methods in certain aspects. The source code of Dinspector is publicly available at: https://github.com/Qc-TX/Dinspector.
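A minimal sketch (assumed feature sizes and head count, not the released Dinspector code) of the two-layer structure described above, built from the PyTorch Geometric SAGEConv and GATConv layers; the paper's dual factor mechanism additionally attends over edge features, which is omitted here.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, GATConv

# Illustrative two-layer GNN over a provenance graph: GraphSAGE aggregation
# followed by multi-head attention, producing per-process (node) logits.
class TwoLayerProvenanceGNN(torch.nn.Module):
    def __init__(self, in_dim=32, hidden=64, n_classes=2, heads=8):
        super().__init__()
        self.sage = SAGEConv(in_dim, hidden)
        self.gat = GATConv(hidden, n_classes, heads=heads, concat=False)
    def forward(self, x, edge_index):
        h = F.relu(self.sage(x, edge_index))
        return self.gat(h, edge_index)          # per-node logits (benign vs. malicious)

if __name__ == "__main__":
    # Toy provenance graph: 4 process nodes, directed interaction edges.
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
    x = torch.randn(4, 32)
    model = TwoLayerProvenanceGNN()
    print(model(x, edge_index).shape)           # torch.Size([4, 2])
```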
{"title":"Dinspector: Dual factor graph attention mechanism for Advanced Persistent Threat detection","authors":"Hongchao Wang,&nbsp;Wen Chen,&nbsp;Linrui Li,&nbsp;Haoyang Pu,&nbsp;Yilin Zhang","doi":"10.1016/j.engappai.2026.113861","DOIUrl":"10.1016/j.engappai.2026.113861","url":null,"abstract":"<div><div>Advanced Persistent Threat (APT) has greatly threatened global cybersecurity. In recent years, a promising approach for APT detection has been proposed based on Graph Neural Networks (GNN) and provenance graphs constructed from host logs, in which graph nodes and edges represent processes and interactions, respectively. However, current methods primarily focus on the individual behaviors of single nodes (attacking subjects), neglecting the in-depth analysis of interactions between collaborative attacking processes. This results in limited capability in detecting complex and composite APT attacks. In this paper, a new GNN-based APT detection method, Dinspector, is proposed. It utilizes a dual factor attention mechanism to aggregate the features between neighboring nodes and edges simultaneously. Furthermore, Dinspector combines GraphSAGE (Sample and Aggregate) and the graph attention layer into a two-layer Graph Neural Network structure. By integrating node features, structural features, and neighbor features, Dinspector is capable of extracting features of complex attack patterns, improving the detection performance of novel APT attacks. Experimental results on three public datasets demonstrated that Dinspector achieves an average precision of 98% and a recall rate of 99%, attaining state-of-the-art detection performance and outperforming them in certain aspects. The source code of Dinspector is publicly available at: <span><span>https://github.com/Qc-TX/Dinspector</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113861"},"PeriodicalIF":8.0,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145980150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Hybrid machine learning and physical modeling framework for climate-driven risk zonation of concrete shrinkage damage
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-15, DOI: 10.1016/j.engappai.2026.113788
Qiaosong Hu, Dujian Zou, Zhilin Bai, Tiejun Liu, Ao Zhou
Concrete shrinkage under non-stationary climatic forcing poses an increasing threat to the serviceability and longevity of infrastructure in low-pressure, arid and high-altitude regions. Current models neglect multi-environment interactions and climate-driven risk evolution. This study presents a hybrid modeling and assessment framework that coupled physics-informed empirical priors with optimized machine learning to predict shrinkage evolution, quantify structural risk, and map spatiotemporal vulnerability under future climate scenarios. A curated shrinkage database was fused with high-resolution meteorological projections and downscaled via filtering and cubic interpolation. The empirical CEB-FIP 2010 shrinkage formulation and air pressure parameters were embedded into feature engineering to create temperature-humidity-pressure coupled predictors. An XGBoost (Extreme Gradient Boosting) model was optimized through systematic hyperparameter tuning and physics-guided transfer learning. The optimized coupling model attained R² = 0.92 in predicting shrinkage evolution, and reduced long-term prediction divergence to within 15% against independent data from three-factor experiments. To translate material-level shrinkage into structural risk, multiphysics finite-element simulations of a representative reinforced-concrete pier incorporated eigenstrain shrinkage fields and reinforcement constraint to resolve strain–stress–damage progression. Four critical normalized strain thresholds were identified that demarcated initiation, stable propagation, accelerated expansion and through-crack stages. A five-tier risk zoning map across China was constructed, covering both historical data and a mid-future climate scenario. The plateau and northwestern basins showed marked vulnerability. Using C60 concrete as a representative case study due to its prevalence, results showed the medium-to-high risk area increasing by 65%, with 31.1% of China's territory classified as medium–high risk by 2050.
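A minimal sketch of the physics-guided pattern on synthetic data: an empirical prior computed from temperature, humidity, and pressure is appended as an extra feature before systematic hyperparameter tuning of an XGBoost regressor. The prior function below is a hypothetical placeholder, not the CEB-FIP 2010 formulation, and the transfer-learning and finite-element stages are not shown.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

def shrinkage_prior(temp_c, rel_hum, pressure_kpa):
    """Hypothetical monotone surrogate for an empirical shrinkage prior."""
    return (1.0 - rel_hum) * (1.0 + 0.01 * temp_c) * (101.3 / pressure_kpa)

# Synthetic temperature-humidity-pressure samples and shrinkage target.
rng = np.random.default_rng(7)
n = 400
temp = rng.uniform(-5, 40, n)
hum = rng.uniform(0.2, 0.95, n)
pres = rng.uniform(60, 101, n)
prior = shrinkage_prior(temp, hum, pres)
X = np.column_stack([temp, hum, pres, prior])        # physics-coupled predictors
y = 600 * prior + rng.normal(0, 10, n)               # synthetic shrinkage strain

# Systematic hyperparameter tuning over a small illustrative grid.
search = GridSearchCV(
    XGBRegressor(objective="reg:squarederror"),
    param_grid={"n_estimators": [200, 400], "max_depth": [3, 5],
                "learning_rate": [0.05, 0.1]},
    cv=3, scoring="r2",
)
search.fit(X, y)
print("best params:", search.best_params_, "cv R2:", round(search.best_score_, 3))
```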
{"title":"Hybrid machine learning and physical modeling framework for climate-driven risk zonation of concrete shrinkage damage","authors":"Qiaosong Hu ,&nbsp;Dujian Zou ,&nbsp;Zhilin Bai ,&nbsp;Tiejun Liu ,&nbsp;Ao Zhou","doi":"10.1016/j.engappai.2026.113788","DOIUrl":"10.1016/j.engappai.2026.113788","url":null,"abstract":"<div><div>Concrete shrinkage under non-stationary climatic forcing poses an increasing threat to the serviceability and longevity of infrastructure in low-pressure, arid and high-altitude regions. Current models neglect multi-environment interactions and climate-driven risk evolution. This study presents a hybrid modeling and assessment framework that coupled physics-informed empirical priors with optimized machine learning to predict shrinkage evolution, quantify structural risk, and map spatiotemporal vulnerability under future climate scenarios. A curated shrinkage database was fused with high-resolution meteorological projections and downscaled via filtering and cubic interpolation. The empirical CEB-FIP 2010 shrinkage formulation and air pressure parameters were embedded into feature engineering to create temperature-humidity-pressure coupled predictors. An XGBoost (Extreme Gradient Boosting) model was optimized through systematic hyperparameter tuning and physics-guided transfer learning. The optimized coupling model attained R<sup>2</sup> = 0.92 to predict shrinkage evolution, and reduced long-term prediction divergence to within 15% against independent data from three-factor experiments. To translate material-level shrinkage into structural risk, multiphysics finite-element simulations of a representative reinforced-concrete pier incorporated eigenstrain shrinkage fields and reinforcement constraint to resolve strain–stress–damage progression. Four critical normalized strain thresholds were identified that demarcated initiation, stable propagation, accelerated expansion and through-crack stages. A five-tier risk zoning map across China was constructed, covering both historical data and mid-future climate scenario. Plateau and northwestern basins showed marked vulnerability. Using C60 concrete as a representative case study due to its prevalence, results showed the medium-to-high risk area increasing by 65%, with 31.1% of China's territory classified as medium–high risk by 2050.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113788"},"PeriodicalIF":8.0,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dual-channel convolutional neural network for tomato pesticide residue detection using Gramian angular field transformations
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-15, DOI: 10.1016/j.engappai.2026.113870
Yanshen Zhao, Yifan Zhao, Huayu Fu, Xiang Ji, Zhongzhi Han
Pesticide residues pose significant risks to food safety and human health, especially in crops like tomatoes, which rely heavily on pesticide use in greenhouse cultivation. Spectral detection techniques, although promising for their lossless nature, face challenges due to the limited feature information available from one-dimensional spectral data. To overcome this, we applied Gramian Angular Field (GAF) methods, including the Gramian Angular Difference Field (GADF) and Gramian Angular Summation Field (GASF), to transform the spectral data into two-dimensional representations, enhancing feature extraction. A Dual-Channel Convolutional Neural Network (DCCNN) was able to achieve an accuracy of 93.50 % on the tomato dataset. Additionally, using Principal Component Analysis (PCA) for key wavelength selection further improved performance: accuracy reached 89.77 % with 20 % of wavelengths and 93.68 % with 80 %. Generalizability tests conducted on apple and cucumber datasets resulted in accuracies of 91.28 % and 81.45 %, respectively. For the bacterial dataset and the aflatoxin B1 (AFB1) dataset, the model achieved performances of 90.83 % and 84.69 %, respectively. These findings highlight the effectiveness of GAF methods and DCCNN for pesticide residue detection in tomatoes, while also suggesting that further advancements in feature extraction and selection could broaden the application of these techniques to other agricultural crops.
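A minimal NumPy sketch of the Gramian Angular Field step described above: a one-dimensional spectrum is rescaled to [-1, 1], mapped to polar angles, and expanded into the GASF and GADF images that the two CNN branches would consume. The synthetic input and its length are illustrative assumptions, and the network itself is not shown.

```python
import numpy as np

# Standard GAF construction: GASF_ij = cos(phi_i + phi_j),
# GADF_ij = sin(phi_i - phi_j), with phi the polar angle of the rescaled series.
def gramian_angular_fields(series):
    s = np.asarray(series, dtype=float)
    s = 2 * (s - s.min()) / (s.max() - s.min()) - 1          # rescale to [-1, 1]
    phi = np.arccos(np.clip(s, -1, 1))                       # polar angle per band
    gasf = np.cos(phi[:, None] + phi[None, :])               # summation field
    gadf = np.sin(phi[:, None] - phi[None, :])               # difference field
    return gasf, gadf

if __name__ == "__main__":
    # Synthetic 64-band "spectrum" standing in for real spectral data.
    spectrum = np.sin(np.linspace(0, 3 * np.pi, 64)) \
        + 0.1 * np.random.default_rng(0).normal(size=64)
    gasf, gadf = gramian_angular_fields(spectrum)
    print(gasf.shape, gadf.shape)                            # (64, 64) (64, 64)
```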
{"title":"Dual-channel convolutional neural network for tomato pesticide residue detection using Gramian angular field transformations","authors":"Yanshen Zhao,&nbsp;Yifan Zhao,&nbsp;Huayu Fu,&nbsp;Xiang Ji,&nbsp;Zhongzhi Han","doi":"10.1016/j.engappai.2026.113870","DOIUrl":"10.1016/j.engappai.2026.113870","url":null,"abstract":"<div><div>Pesticide residues pose significant risks to food safety and human health, especially in crops like tomatoes, which rely heavily on pesticide use in greenhouse cultivation. Spectral detection techniques, although promising for their lossless nature, face challenges due to the limited feature information available from one-dimensional spectral data. To overcome this, we applied Gramian Angular Field (GAF) methods, including the Gramian Angular Difference Field (GADF) and Gramian Angular Summation Field (GASF), to transform the spectral data into two-dimensional representations, enhancing feature extraction. A Dual-Channel Convolutional Neural Network (DCCNN) was able to achieve an accuracy of 93.50 % on the tomato dataset. Additionally, using Principal Component Analysis (PCA) for key wavelength selection further improved performance: accuracy reached 89.77 % with 20 % of wavelengths and 93.68 % with 80 %. Generalizability tests conducted on apple and cucumber datasets resulted in accuracies of 91.28 % and 81.45 %, respectively. For the bacterial dataset and the aflatoxin B1 (AFB1) dataset, the model achieved performances of 90.83 % and 84.69 %, respectively. These findings highlight the effectiveness of GAF methods and DCCNN for pesticide residue detection in tomatoes, while also suggesting that further advancements in feature extraction and selection could broaden the application of these techniques to other agricultural crops.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113870"},"PeriodicalIF":8.0,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145979877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
LiteFormer: A lightweight encoder-only Transformer for efficient financial time series forecasting across global stock indices
IF 8, CAS Zone 2 (Computer Science), Q1 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2026-01-14, DOI: 10.1016/j.engappai.2025.113681
Nguyen Quoc Anh, Tran Truong Tuan Phat, Ha Xuan Son, Thai Thi Thanh Nhan, Nguyen Ngoc Phien, Trung Phan Hoang Tuan, Ngan Nguyen Thi Kim
Financial time-series forecasting is challenged by non-linear, non-stationary dynamics driven by macroeconomic factors, market sentiment, and stochastic events. Traditional statistical models assume stationarity and linear dependencies, failing to capture complex temporal patterns, while deep learning approaches struggle with vanishing gradients and long-term dependencies. Standard Transformers incur high computational costs (quadratic complexity, O(n²·d), per layer) due to attention mechanisms and large parameter counts, where n is the sequence length and d is the model dimension. This study proposes LiteFormer, a lightweight, encoder-only Transformer for univariate stock price forecasting, leveraging N=4 encoder layers with h=8 multi-head self-attention and feed-forward networks (d_ff=512). Operating on sequences of closing prices (T=14, d_model=128), LiteFormer employs sinusoidal positional encodings, a causal mask, dropout (p=0.1), and layer normalization to model temporal dependencies and enhance generalization. With only 750,000+ parameters, LiteFormer reduces per-layer complexity via compact design, thereby enabling low-latency inference (38 milliseconds) and energy efficiency (96.894 watts), which promises to offer scalable real-time inference for industrial fintech systems. Experiments across 30 stocks from the S&P 500, FTSE 100, and Nikkei 225 indices demonstrate Mean Absolute Error and Root Mean Square Error reductions of 3.45%–9.09% over vanilla Transformers and up to 48% over recurrent neural models for high-volatility stocks. LiteFormer's efficient, interpretable architecture, driven by attention weights, offers a scalable solution with potential for multivariate extensions and real-world multi-modal applications in predictive domains.
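A minimal PyTorch sketch using the hyperparameters quoted above (T=14, d_model=128, h=8, d_ff=512, N=4, dropout 0.1): an encoder-only forecaster with sinusoidal positional encoding and a causal mask. It is an illustrative reconstruction, not the authors' released LiteFormer implementation.

```python
import math
import torch
import torch.nn as nn

class SinusoidalPositionalEncoding(nn.Module):
    def __init__(self, d_model, max_len=500):
        super().__init__()
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2], pe[:, 1::2] = torch.sin(pos * div), torch.cos(pos * div)
        self.register_buffer("pe", pe)
    def forward(self, x):                       # x: (batch, T, d_model)
        return x + self.pe[: x.size(1)]

class LiteFormerSketch(nn.Module):
    def __init__(self, d_model=128, heads=8, d_ff=512, n_layers=4, dropout=0.1):
        super().__init__()
        self.embed = nn.Linear(1, d_model)      # univariate closing-price input
        self.pos = SinusoidalPositionalEncoding(d_model)
        layer = nn.TransformerEncoderLayer(d_model, heads, d_ff, dropout,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)       # next-step price
    def forward(self, x):                       # x: (batch, T, 1)
        T = x.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        h = self.encoder(self.pos(self.embed(x)), mask=causal)
        return self.head(h[:, -1])              # forecast from the last position

if __name__ == "__main__":
    model = LiteFormerSketch()
    prices = torch.randn(32, 14, 1)             # 32 windows of 14 closing prices
    print(model(prices).shape)                  # torch.Size([32, 1])
```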
{"title":"LiteFormer: A lightweight encoder-only Transformer for efficient financial time series forecasting across global stock indices","authors":"Nguyen Quoc Anh ,&nbsp;Tran Truong Tuan Phat ,&nbsp;Ha Xuan Son ,&nbsp;Thai Thi Thanh Nhan ,&nbsp;Nguyen Ngoc Phien ,&nbsp;Trung Phan Hoang Tuan ,&nbsp;Ngan Nguyen Thi Kim","doi":"10.1016/j.engappai.2025.113681","DOIUrl":"10.1016/j.engappai.2025.113681","url":null,"abstract":"<div><div>Financial time-series forecasting is challenged by non-linear, non-stationary dynamics driven by macroeconomic factors, market sentiment, and stochastic events. Traditional statistical models assume stationarity and linear dependencies, failing to capture complex temporal patterns, while deep learning approaches struggle with vanishing gradients and long-term dependencies. Standard Transformers incur high computational costs (quadratic complexity, <span><math><mrow><mi>O</mi><mrow><mo>(</mo><msup><mrow><mi>n</mi></mrow><mrow><mn>2</mn></mrow></msup><mi>⋅</mi><mi>d</mi><mo>)</mo></mrow></mrow></math></span>, per layer) due to attention mechanisms and large parameter counts, where <span><math><mi>n</mi></math></span> is the sequence length and <span><math><mi>d</mi></math></span> is the model dimension. This study proposes LiteFormer, a lightweight, encoder-only Transformer for univariate stock price forecasting, leveraging <span><math><mrow><mi>N</mi><mo>=</mo><mn>4</mn></mrow></math></span> encoder layers with <span><math><mrow><mi>h</mi><mo>=</mo><mn>8</mn></mrow></math></span> multi-head self-attention and feed-forward networks (<span><math><mrow><msub><mrow><mi>d</mi></mrow><mrow><mtext>ff</mtext></mrow></msub><mo>=</mo><mn>512</mn></mrow></math></span>). Operating on sequences of closing prices (<span><math><mrow><mi>T</mi><mo>=</mo><mn>14</mn></mrow></math></span>, <span><math><mrow><msub><mrow><mi>d</mi></mrow><mrow><mtext>model</mtext></mrow></msub><mo>=</mo><mn>128</mn></mrow></math></span>), LiteFormer employs sinusoidal positional encodings, a causal mask, dropout (<span><math><mrow><mi>p</mi><mo>=</mo><mn>0</mn><mo>.</mo><mn>1</mn></mrow></math></span>), and layer normalization to model temporal dependencies and enhance generalization. With only 750,000+ parameters, LiteFormer reduces per layer complexity via compact design, thereby enabling low-latency inference (38 millisecond) and energy efficiency (96.894 Watt), which promises to offers scalable real-time inference for industrial fintech systems. Experiments across 30 stocks from the S&amp;P 500, FTSE 100, and Nikkei 225 indices demonstrate Mean Absolute Error and Root Mean Square Error reductions of 3.45%–9.09% over vanilla Transformers and up to 48% over recurrence neural models for high-volatility stocks. 
LiteFormer’s efficient, interpretable architecture, driven by attention weights, offers a scalable solution with potential for multivariate extensions and real-world multi-modal applications in predictive domain.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"167 ","pages":"Article 113681"},"PeriodicalIF":8.0,"publicationDate":"2026-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145957667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0