Optimization Enabled Online Tiger-Claw Fuzzy Region With Clustering Based Neovascularization Segmentation and Classification Using YOLO-V5 From Retinal Fundus Images
M. Kathiravan, Ashwini A., Balasubramaniam S., T. D. Subha, Gururama Senthilvel P., Sivakumar T. A.
Neovascularization, the pathological growth of abnormal blood vessels, is a major vision-threatening complication of diabetic retinopathy. It is driven mainly by oxygen depletion in the retinal capillaries, which triggers abnormal patterns of vascular development. Detecting these abnormalities in fundus images early and precisely enables ophthalmologists to diagnose correctly and treat effectively. A multistage image processing system addresses this problem. First, a fusion-based contrast enhancement method improves the brightness and contrast of diabetic retinopathy fundus images, after which detail weighted histogram equalization is applied to the green channel to make structural details more visible. In the second stage, the proposed online tiger-claw algorithm segments abnormal neovascularization from normal blood vessels; fuzzy zone-based clustering with optimization and classifier thresholding then localizes and highlights the neovascularized areas. In the third stage, a YOLOv5 neural network performs feature extraction and classification for neovascularization detection, and multistage gray wolf optimization refines the segmentation. The proposed algorithm was tested on the public datasets STARE, DRIVE, MESSIDOR, and DIARETDB1. Experiments show that neovascularization region marking achieved 98.19% sensitivity, 96.56% specificity, and 99.27% accuracy, while neovascularization detection achieved 97.03% accuracy, 98.94% sensitivity, and 97.17% specificity.
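To make the preprocessing stage concrete, the sketch below extracts the green channel (which carries the highest vessel contrast in fundus images) and enhances it; OpenCV's CLAHE is used as a stand-in for the paper's detail weighted histogram equalization, whose exact weighting scheme the abstract does not specify:

```python
import cv2
import numpy as np

def enhance_green_channel(fundus_bgr: np.ndarray) -> np.ndarray:
    """Contrast-enhance the green channel of a fundus image.

    CLAHE is a common choice for retinal vessel enhancement and stands in
    here for the paper's detail weighted histogram equalization.
    """
    green = fundus_bgr[:, :, 1]  # OpenCV loads images as BGR; index 1 is green
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(green)

# Usage: enhanced = enhance_green_channel(cv2.imread("fundus.png"))
```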
{"title":"Optimization Enabled Online Tiger-Claw Fuzzy Region With Clustering Based Neovascularization Segmentation and Classification Using YOLO-V5 From Retinal Fundus Images","authors":"M. Kathiravan, Ashwini A., Balasubramaniam S., T. D. Subha, Gururama Senthilvel P., Sivakumar T. A.","doi":"10.1155/int/6119924","DOIUrl":"https://doi.org/10.1155/int/6119924","url":null,"abstract":"<p>The pathological development of abnormal blood vessels results in neovascularization as a major vision-threatening condition of diabetic retinopathy. The main factor behind pathological vessel growth results from retinal capillary depletion of oxygen that causes abnormal vascular development patterns. Early detection of these fundus image abnormalities requires precision because it enables ophthalmologists to provide effective treatment and make proper diagnoses. A multiple-step image processing system treats this problem. A fusion-based contrast enhancement method begins the process of enhancing diabetic retinopathy fundus image brightness and contrast. After the initial process, the system applies detail weighted histogram equalization to the green channel for better structural detail visualization. In the second stage, the proposed online tiger-claw algorithm segments abnormal neovascularization from normal blood vessels. Next, the combination of fuzzy zone-based clustering with optimization and classifier thresholding performs local identification along with highlight generation for neovascularized areas. Neovascularization detection makes use of a YOLOv5 neural network in the third stage through feature extraction and classification operations. A refined segmentation process occurs with the application of multistage gray wolf optimization. The proposed algorithm underwent testing through its application to the public datasets STARE, DRIVE, MESSIDOR, and DIARETDB1. Experimental tests indicate that the neovascularization region marking performed with 98.19% sensitivity and 96.56% specificity while reaching 99.27% accuracy. The proposed approach demonstrates 97.03% accuracy and 98.94% sensitivity, together with 97.17% specificity in neovascularization detection.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/6119924","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145572459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Energy-Aware Regression in Spiking Neural Networks for Autonomous Driving: A Comparative Study With Convolutional Networks
Fernando Sevilla Martínez, Jordi Casas-Roma, Laia Subirats, Raúl Parada
As autonomous driving (AD) systems grow more complex, their rising computational demands pose significant energy and sustainability challenges. This paper investigates spiking neural networks (SNNs) as low-power alternatives to convolutional neural networks (CNNs) for regression tasks in AD. We introduce a membrane-potential (Vmem) decoding framework that converts binary spike trains into continuous outputs and propose the energy-to-error ratio (EER), a unified metric combining prediction error with energy consumption. Three CNN architectures (PilotNet, LaksNet, and MiniNet) and their corresponding SNN variants are trained and evaluated using delta, latency, and rate encoding across varied parameter settings, with energy use and emissions logged. Delta-encoded SNNs achieve the highest EER, delivering substantial energy savings with minimal performance loss, whereas CNNs, despite slightly better MSE, incur 10–20× higher energy costs. Rate encoding underperforms, and latency encoding, though improving relative error, demands excessive energy. Parameter tuning (threshold θ, temporal dynamics S, membrane time constant τ, and gain G) directly influences eco-efficiency. All experiments run on standard GPUs, showing SNNs can surpass CNNs in eco-efficiency without specialized hardware. Paired statistical tests confirm that only delta-encoded SNNs achieve significant EER improvements. This work presents a practical, energy-aware evaluation framework for neural architectures, establishing EER as a critical metric for sustainable machine learning in intelligent transport and beyond.
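The abstract does not give the exact EER formula, so the sketch below assumes one plausible form (the inverse of error times energy, so higher means more eco-efficient), together with a minimal delta encoder of the kind the paper evaluates; both the formula and the threshold value are illustrative assumptions:

```python
import numpy as np

def energy_to_error_ratio(y_true, y_pred, energy_joules):
    """Assumed EER form: 1 / (MSE * energy). Higher = more eco-efficient.
    The paper's exact definition may differ; this is illustrative only."""
    mse = float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))
    return 1.0 / (mse * energy_joules)

def delta_encode(signal, threshold=0.1):
    """Delta encoding: emit a signed spike whenever the input has changed by
    more than `threshold` since the last emitted spike, else stay silent."""
    spikes, last = [], signal[0]
    for x in signal[1:]:
        if abs(x - last) > threshold:
            spikes.append(1 if x > last else -1)
            last = x
        else:
            spikes.append(0)
    return np.array(spikes)
```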
{"title":"Energy-Aware Regression in Spiking Neural Networks for Autonomous Driving: A Comparative Study With Convolutional Networks","authors":"Fernando Sevilla Martínez, Jordi Casas-Roma, Laia Subirats, Raúl Parada","doi":"10.1155/int/4879993","DOIUrl":"https://doi.org/10.1155/int/4879993","url":null,"abstract":"<p>As autonomous driving (AD) systems grow more complex, their rising computational demands pose significant energy and sustainability challenges. This paper investigates spiking neural networks (SNNs) as low-power alternatives to convolutional neural networks (CNNs) for regression tasks in AD. We introduce a membrane-potential (<i>V</i><sub>mem</sub>) decoding framework that converts binary spike trains into continuous outputs and propose the energy-to-error ratio (EER), a unified metric combining prediction error with energy consumption. Three CNN architectures (PilotNet, LaksNet, and MiniNet) and their corresponding SNN variants are trained and evaluated using delta, latency, and rate encoding across varied parameter settings, with energy use and emissions logged. Delta-encoded SNNs achieve the highest EER, substantial energy savings with minimal performance loss, whereas CNNs, despite slightly better MSE, incur 10–20 × higher energy costs. Rate encoding underperforms, and latency encoding, though improving relative error, demands excessive energy. Parameter tuning (threshold <i>θ</i>, temporal dynamics (<i>S</i>), membrane time constant (<i>τ</i>), and gain <i>G</i>) directly influences eco-efficiency. All experiments run on standard GPUs, showing SNNs can surpass CNNs in eco-efficiency without specialized hardware. Paired statistical tests confirm that only delta-encoded SNNs achieve significant EER improvements. This work presents a practical, energy-aware evaluation framework for neural architectures, establishing EER as a critical metric for sustainable machine learning in intelligent transport and beyond.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/4879993","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145522267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Knowledge Distillation in Federated Learning: A Survey on Long Lasting Challenges and New Solutions
Laiqiao Qin, Tianqing Zhu, Wanlei Zhou, Philip S. Yu
Federated learning (FL) is a distributed and privacy-preserving machine learning paradigm that coordinates multiple clients to train a model while keeping the raw data localized. However, traditional FL faces several challenges, including privacy risks, data heterogeneity, communication bottlenecks, and system heterogeneity. To tackle these challenges, knowledge distillation (KD) has been widely applied in FL since 2020. KD is a well-validated and effective technique for model compression and enhancement. Its core concept is to transfer knowledge between models by exchanging logits at intermediate or output layers. These properties make KD an excellent solution for the long-lasting challenges in FL. To date, few reviews have summarized and analyzed current trends and methods for applying KD efficiently in FL. This article aims to provide a comprehensive survey of KD-based FL, focusing on addressing the above challenges. First, we provide an overview of KD-based FL, including its motivation, basics, taxonomy, a comparison with traditional FL, and a discussion of where KD should be executed. We also analyze the critical factors in KD-based FL in the Appendix, including teachers, knowledge, data, and methods. We then discuss how KD can address the challenges in FL, including privacy protection, data heterogeneity, communication efficiency, and personalization. Finally, we discuss the challenges facing KD-based FL algorithms and future research directions. We hope this survey can provide insights and guidance for researchers and practitioners in the FL area.
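The logit-exchange mechanism at the heart of KD is the standard distillation loss of Hinton et al.; a minimal PyTorch sketch (not any specific algorithm from the surveyed papers) is:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Classic logit-based KD: KL divergence between temperature-softened
    teacher and student output distributions."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # The T**2 factor rescales gradients to the magnitude of the hard-label loss
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T ** 2)
```

In FL, the same loss is applied with the server ensemble (or peer clients) playing the teacher role, so only logits, rather than raw data or full model weights, need to be exchanged.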
{"title":"Knowledge Distillation in Federated Learning: A Survey on Long Lasting Challenges and New Solutions","authors":"Laiqiao Qin, Tianqing Zhu, Wanlei Zhou, Philip S. Yu","doi":"10.1155/int/7406934","DOIUrl":"https://doi.org/10.1155/int/7406934","url":null,"abstract":"<p>Federated learning (FL) is a distributed and privacy-preserving machine learning paradigm that coordinates multiple clients to train a model while keeping the raw data localized. However, this traditional FL poses some challenges, including privacy risks, data heterogeneity, communication bottlenecks, and system heterogeneity issues. To tackle these challenges, knowledge distillation (KD) has been widely applied in FL since 2020. KD is a validated and efficacious model compression and enhancement algorithm. The core concept of KD involves facilitating knowledge transfer between models by exchanging logits at intermediate or output layers. These properties make KD an excellent solution for the long-lasting challenges in FL. Up to now, there have been few reviews that summarize and analyze the current trend and methods for how KD can be applied in FL efficiently. This article aims to provide a comprehensive survey of KD-based FL, focusing on addressing the above challenges. First, we provide an overview of KD-based FL, including its motivation, basics, taxonomy, and a comparison with traditional FL and where KD should execute. We also analyze the critical factors in KD-based FL in the Appendix, including teachers, knowledge, data, and methods. We discuss how KD can address the challenges in FL, including privacy protection, data heterogeneity, communication efficiency, and personalization. Finally, we discuss the challenges facing KD-based FL algorithms and future research directions. We hope this survey can provide insights and guidance for researchers and practitioners in the FL area.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/7406934","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145521665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Noncontact Fault Diagnosis of Electrical Equipment Using Modified Multiscale Two-Dimensional Color Distribution Entropy and Thermal Imaging
Shun Wang, Yolanda Vidal, Francesc Pozo
Effective health monitoring of electrical equipment is critical for industrial reliability. Although infrared thermal imaging offers a powerful noncontact diagnostic method, accurately interpreting its complex and often noisy thermal patterns remains a significant challenge. Entropy-based analysis is well suited for quantifying this complexity, but its application to images has been limited. Existing two-dimensional entropy methods are not only less developed than their one-dimensional counterparts but also typically require converting thermal images to grayscale, which discards vital diagnostic information from color channels. To overcome these limitations, this study introduces the modified multiscale two-dimensional color distribution entropy (MMCDEn2D). This novel method directly integrates the attributes of the RGB color channels, preserving a richer feature set for analysis. The effectiveness of the proposed method is demonstrated first through synthetic signals, showing low sensitivity to image size and high computational efficiency. The study further extends the application of entropy-based analysis to noncontact health monitoring scenarios, implementing MMCDEn2D for thermal image-based fault diagnosis of induction motors and power transformers. The method achieves a diagnostic accuracy that exceeds 95%, significantly outperforming traditional approaches. Crucially, it demonstrates superior robustness in challenging scenarios, improving accuracy by 2%–5% under high-noise conditions and with small sample sizes. These results establish MMCDEn2D as a highly effective and reliable tool to advance noncontact fault diagnosis in critical electrical equipment.
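MMCDEn2D itself is not specified in the abstract, but its core idea, measuring complexity directly on the color channels instead of a grayscale conversion, can be illustrated with a much simpler joint-RGB entropy (the actual method is multiscale and distribution-based, which this sketch omits):

```python
import numpy as np

def rgb_histogram_entropy(img_rgb: np.ndarray, bins: int = 8) -> float:
    """Illustrative only: Shannon entropy of the joint RGB histogram of a
    uint8 image. `bins` should divide 256 evenly."""
    q = (img_rgb.reshape(-1, 3) // (256 // bins)).astype(np.int64)  # quantize channels
    joint = q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]        # joint bin index
    counts = np.bincount(joint, minlength=bins ** 3).astype(float)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())
```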
{"title":"Noncontact Fault Diagnosis of Electrical Equipment Using Modified Multiscale Two-Dimensional Color Distribution Entropy and Thermal Imaging","authors":"Shun Wang, Yolanda Vidal, Francesc Pozo","doi":"10.1155/int/4805844","DOIUrl":"https://doi.org/10.1155/int/4805844","url":null,"abstract":"<p>Effective health monitoring of electrical equipment is critical for industrial reliability. Although infrared thermal imaging offers a powerful noncontact diagnostic method, accurately interpreting its complex and often noisy thermal patterns remains a significant challenge. Entropy-based analysis is well suited for quantifying this complexity, but its application to images has been limited. Existing two-dimensional entropy methods are not only less developed than their one-dimensional counterparts but also typically require converting thermal images to grayscale, which discards vital diagnostic information from color channels. To overcome these limitations, this study introduces the modified multiscale two-dimensional color distribution entropy (MMCDEn<sub>2D</sub>). This novel method directly integrates the attributes of the RGB, preserving a richer feature set for analysis. The effectiveness of the proposed method is demonstrated first through synthetic signals, showing low sensitivity to image size and high computational efficiency. The study further extends the application of entropy-based analysis to noncontact health monitoring scenarios, implementing MMCDEn<sub>2D</sub> for thermal image-based fault diagnosis of induction motors and power transformers. The method achieves a diagnostic accuracy that exceeds 95%, significantly outperforming traditional approaches. Crucially, it demonstrates superior robustness in challenging scenarios, improving accuracy by 2%–5% under high-noise conditions and with small sample sizes. These results establish MMCDEn<sub>2D</sub> as a highly effective and reliable tool to advance noncontact fault diagnosis in critical electrical equipment.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/4805844","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145521741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DiffG-MTL: A Dynamic Multidiffusion Graph Network for Multitask Traffic Accident Prediction
Nana Bu, Zongtao Duan, Wen Dang
Traffic accident prediction serves as a cornerstone of intelligent transportation systems, enabling proactive city-wide control strategies and public safety interventions. Effective models must capture the evolving spatiotemporal propagation of risk while addressing heterogeneous data distributions across urban regions. Current approaches face significant limitations: fixed graph topologies fail to represent nonstationary accident patterns, while uniform task weighting leads to optimization bias toward data-rich areas, ultimately constraining adaptability in adjacency construction and multihop spatial reasoning. To address these challenges, we propose a dynamic multidiffusion graph network with multitask learning (DiffG-MTL) for city-scale accident prediction. Specifically, a dynamic diffusion adjacency generation (DDAG) module constructs time-varying, diffusion-based adjacency matrices through multiple propagation pathways. A multiscale graph structure learning (MGSL) module captures multihop spatial relationships and temporal cues, while effectively highlighting anomalous traffic behaviors. To alleviate regional data imbalance, we introduce a dynamic multitask learning objective that adaptively redistributes learning focus using recall-aware weighting and task-level normalization. Comprehensive evaluations on six widely used datasets demonstrate that DiffG-MTL consistently outperforms state-of-the-art baselines across multiple evaluation metrics. Additional experiments validate its robustness and effectiveness in modeling complex spatiotemporal accident patterns.
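The diffusion component of DDAG can be pictured as a weighted sum of random-walk transition powers; the static sketch below shows only that part (the paper's adjacency is additionally time-varying, and the hop weights here are assumed uniform):

```python
import numpy as np

def diffusion_adjacency(A: np.ndarray, K: int = 3, theta=None) -> np.ndarray:
    """Multihop diffusion adjacency: sum_k theta_k * P^(k+1), where
    P = D^-1 A is the row-normalized transition matrix."""
    P = A / np.clip(A.sum(axis=1, keepdims=True), 1e-8, None)
    theta = np.ones(K) / K if theta is None else np.asarray(theta)
    out, Pk = np.zeros_like(P), np.eye(A.shape[0])
    for k in range(K):
        Pk = Pk @ P          # advance one diffusion hop
        out += theta[k] * Pk
    return out
```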
{"title":"DiffG-MTL: A Dynamic Multidiffusion Graph Network for Multitask Traffic Accident Prediction","authors":"Nana Bu, Zongtao Duan, Wen Dang","doi":"10.1155/int/8995422","DOIUrl":"https://doi.org/10.1155/int/8995422","url":null,"abstract":"<p>Traffic accident prediction serves as a cornerstone of intelligent transportation systems, enabling proactive city-wide control strategies and public safety interventions. Effective models must capture the evolving spatiotemporal propagation of risk while addressing heterogeneous data distributions across urban regions. Current approaches face significant limitations: fixed graph topologies fail to represent nonstationary accident patterns, while uniform task weighting leads to optimization bias toward data-rich areas, ultimately constraining adaptability in adjacency construction and multihop spatial reasoning. To address these challenges, we propose a dynamic multidiffusion graph network with multitask learning (DiffG-MTL) for city-scale accident prediction. Specifically, a dynamic diffusion adjacency generation (DDAG) module constructs time-varying, diffusion-based adjacency matrices through multiple propagation pathways. A multiscale graph structure learning (MGSL) module captures multihop spatial relationships and temporal cues, while effectively highlighting anomalous traffic behaviors. To alleviate regional data imbalance, we introduce a dynamic multitask learning objective that adaptively redistributes learning focus using recall-aware weighting and task-level normalization. Comprehensive evaluations on six widely used datasets demonstrate that DiffG-MTL consistently outperforms state-of-the-art baselines across multiple evaluation metrics. Additional experiments validate its robustness and effectiveness in modeling complex spatiotemporal accident patterns.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/8995422","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145521832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal Deep Learning for Predicting Cerebral Herniation Using Sagittal CT and Clinical Data
Like Ji, Fuxing Yang, Zicheng Xiong, Jun Qiu, Fang Zuo, Kefan Yi, Shengbo Chen, Wenying Chen, Kai Zhao, Ghulam Mohi-ud-din
Cerebral herniation is a life-threatening neurological emergency in which timely and accurate prediction is crucial for improving patient prognosis. Owing to its speed, CT has become the preferred modality for cerebral herniation screening. As artificial intelligence advances in the field of neurological disease, CT-based models provide significant support for computer-aided clinical diagnosis. However, research on cerebral herniation diagnosis remains limited. Existing methods rely on traditional machine learning or focus solely on midline shift detection, which is not only highly subjective but also neglects key structures such as the brainstem and the rich information in sagittal CT images. To address these limitations, this study focuses on mid-sagittal CT images that include the brainstem and combines them with clinical data to construct a multimodal deep learning framework for cerebral herniation prediction. The model integrates mature, advanced deep learning architectures to extract and fuse features from CT images and clinical text data, employing multiscale convolution and attention mechanisms for diagnostic classification. The model is evaluated on datasets from two centers. On the internal test set, it achieves accuracy, sensitivity, specificity, and AUC of 89%, 92%, 88%, and 0.94, respectively; on the external test set, it attains 81%, 82%, 80%, and 0.89, respectively, outperforming baseline methods and existing state-of-the-art approaches. When compared with radiologists on the internal test set, the model’s performance matches or exceeds the physicians’ consensus. We also reveal the model’s focus regions through visual analysis, which deepens understanding of the model’s prediction process and enhances its interpretability. Experiments demonstrate that the proposed method holds significant potential for assisting cerebral herniation diagnosis.
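The fusion step can be sketched as a simple late-fusion head over precomputed image and text features; the paper's actual backbones (multiscale convolution plus attention) are abstracted away, and all dimensions below are illustrative:

```python
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Concatenate a CT feature vector with a clinical-text embedding,
    then classify. A minimal stand-in for the paper's fusion module."""
    def __init__(self, img_dim=512, txt_dim=256, n_classes=2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + txt_dim, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, img_feat, txt_feat):
        return self.head(torch.cat([img_feat, txt_feat], dim=-1))

# Usage: logits = LateFusionClassifier()(torch.randn(4, 512), torch.randn(4, 256))
```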
{"title":"Multimodal Deep Learning for Predicting Cerebral Herniation Using Sagittal CT and Clinical Data","authors":"Like Ji, Fuxing Yang, Zicheng Xiong, Jun Qiu, Fang Zuo, Kefan Yi, Shengbo Chen, Wenying Chen, Kai Zhao, Ghulam Mohi-ud-din","doi":"10.1155/int/9369999","DOIUrl":"https://doi.org/10.1155/int/9369999","url":null,"abstract":"<p>Cerebral herniation is a life-threatening neurological emergency, where timely and accurate prediction is crucial for improving patient prognosis. Due to its rapid imaging advantages, CT becomes the preferred choice for cerebral herniation screening. With the continuous development of artificial intelligence technology in the field of neurological diseases, CT-based models provide significant support for computer-aided clinical diagnosis. However, current research on cerebral herniation diagnosis remains limited. Existing methods rely on traditional machine learning or focus solely on midline shift detection, which not only exhibits strong subjectivity but also neglects key structures such as the brainstem and the rich information from sagittal CT images. To address these limitations, this study focuses on mid-sagittal CT images including the brainstem and combines clinical data to construct a multimodal deep learning framework for cerebral herniation prediction. The model integrates mature and advanced deep learning architectures to extract and fuse features from CT images and clinical text data, employing multiscale convolution and attention mechanisms for diagnostic classification. The model is evaluated on datasets from two centers. Results show that on the internal test set, the model achieves accuracy, sensitivity, specificity, and AUC of 89%, 92%, 88%, and 0.94, respectively; on the external test set, it attains accuracy, sensitivity, specificity, and AUC of 81%, 82%, 80%, and 0.89, respectively, outperforming baseline methods and existing state-of-the-art approaches. Additionally, when compared with radiologists on the internal test set, the model’s performance matches or exceeds the consensus of physicians. We also reveal the model’s focus region through visual analysis, which further deepens the understanding of the model’s prediction process and enhances its interpretability. Experiments demonstrate that the proposed method holds significant potential in assisting cerebral herniation diagnosis.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/9369999","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145521831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Systematic Review of Intelligent Agents, Language Models, and Recurrent Neural Networks in Industrial Maintenance: Driving Value Creation for the Mining Sector
Luis Rojas, Beatriz Hernandez, José Garcia
This PRISMA 2020–compliant systematic review examines how intelligent agents, large language models (LLMs), and recurrent neural networks (RNNs) can be combined for industrial maintenance, with a sector-specific focus on mining. Scopus and Web of Science (2018–2025) were searched using replicable queries, and a dual text-representation pipeline (TF–IDF with bi/trigrams and sentence-transformer embeddings) was applied. Model selection scanned k over a predefined grid with internal indices (Silhouette, Davies–Bouldin, and Calinski–Harabasz), and robustness was assessed through multiseed stability, bootstrap consensus, representation-sensitivity checks, and a control run with HDBSCAN. Study quality and risk of bias were appraised with an AI-and-control–oriented matrix (ACE-QA). Two macroclusters emerged. The first centers on distributed control, consensus and formation, fault tolerance, observers, and learning-based designs (fuzzy/neural/RL), including finite/predefined-time and event/dynamic event–triggered mechanisms. The second addresses secure and resilient cooperation under cyber threats (DoS, deception, and FDIA), integrating observer-based estimation and communication-efficient protocols. Cross-cutting findings indicate that event-triggered updates reduce bandwidth and compute requirements, while robust estimation and fault-tolerant control improve availability under harsh conditions and intermittent networks—typical in mining. A maturity map suggests high technical readiness and growing adoption for RNN-based sensing analytics, advancing readiness but emerging adoption for multiagent coordination, and early adoption of LLMs for text-grounded maintenance intelligence. Evidence gaps persist in replicability, cross-site transfer, uncertainty reporting, and mining-grade validation at the edge. A design agenda is outlined that prioritizes digital-twin stress testing, edge-first evaluation of agent coordination, secure-by-design pipelines (authenticated/encrypted messaging and adversarial testing), and shift-aware validation. In sum, a hybrid stack—RNNs for perception, LLMs for knowledge grounding, and agents for coordinated action—offers a practical route to reliable, secure, and communication-efficient predictive maintenance in Mining 4.0.
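The model-selection scan the review describes maps directly onto standard scikit-learn calls; a minimal sketch, with the k grid and seed as assumptions:

```python
from sklearn.cluster import KMeans
from sklearn.metrics import (silhouette_score, davies_bouldin_score,
                             calinski_harabasz_score)

def scan_k(X, k_grid=range(2, 11), seed=0):
    """Scan k over a grid and report the three internal indices used in the
    review (Silhouette: higher is better; Davies-Bouldin: lower is better;
    Calinski-Harabasz: higher is better). X is a dense (n_samples,
    n_features) embedding, e.g. sentence-transformer vectors."""
    results = {}
    for k in k_grid:
        labels = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(X)
        results[k] = {
            "silhouette": silhouette_score(X, labels),
            "davies_bouldin": davies_bouldin_score(X, labels),
            "calinski_harabasz": calinski_harabasz_score(X, labels),
        }
    return results
```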
{"title":"A Systematic Review of Intelligent Agents, Language Models, and Recurrent Neural Networks in Industrial Maintenance: Driving Value Creation for the Mining Sector","authors":"Luis Rojas, Beatriz Hernandez, José Garcia","doi":"10.1155/int/9953223","DOIUrl":"https://doi.org/10.1155/int/9953223","url":null,"abstract":"<p>This PRISMA 2020–compliant systematic review examines how intelligent agents, large language models (LLMs), and recurrent neural networks (RNNs) can be combined for industrial maintenance, with a sector-specific focus on mining. Scopus and Web of Science (2018–2025) were searched using replicable queries, and a dual text-representation pipeline (TF–IDF with bi/trigrams and sentence-transformer embeddings) was applied. Model selection scanned <i>k</i> over a predefined grid with internal indices (Silhouette, Davies–Bouldin, and Calinski–Harabasz), and robustness was assessed through multiseed stability, bootstrap consensus, representation-sensitivity checks, and a control run with HDBSCAN. Study quality and risk of bias were appraised with an AI-and-control–oriented matrix (ACE-QA). Two macroclusters emerged. The first centers on distributed control, consensus and formation, fault tolerance, observers, and learning-based designs (fuzzy/neural/RL), including finite/predefined-time and event/dynamic event–triggered mechanisms. The second addresses secure and resilient cooperation under cyber threats (DoS, deception, and FDIA), integrating observer-based estimation and communication-efficient protocols. Cross-cutting findings indicate that event-triggered updates reduce bandwidth and compute requirements, while robust estimation and fault-tolerant control improve availability under harsh conditions and intermittent networks—typical in mining. A maturity map suggests high technical readiness and growing adoption for RNN-based sensing analytics, advancing readiness but emerging adoption for multiagent coordination, and early adoption of LLMs for text-grounded maintenance intelligence. Evidence gaps persist in replicability, cross-site transfer, uncertainty reporting, and mining-grade validation at the edge. A design agenda is outlined that prioritizes digital-twin stress testing, edge-first evaluation of agent coordination, secure-by-design pipelines (authenticated/encrypted messaging and adversarial testing), and shift-aware validation. In sum, a hybrid stack—RNNs for perception, LLMs for knowledge grounding, and agents for coordinated action—offers a practical route to reliable, secure, and communication-efficient predictive maintenance in Mining 4.0.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/9953223","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145521833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Unified Deep Learning Framework for Student Performance Prediction With Multimodal Data in a Blended Learning Environment
Wu Xiuguo
Prediction of student performance is crucial for enhancing the quality of higher education worldwide by enabling timely interventions and personalized support. The blended learning environment, which integrates online and offline instruction, has become a predominant paradigm, yet it also brings significant challenges for the prediction of student performance due to the inherent complexity and heterogeneity of multimodal data generated across both environments. Specifically, existing approaches often fail to effectively leverage the synergistic potential of numerical behavioral traces and unstructured textual feedback from instructors. To address this problem, this study proposes a unified deep learning framework that integrates convolutional neural networks (CNN) and bidirectional long short-term memory (BiLSTM) to capture both temporal dynamics and spatial correlations among features in a blended learning environment. Unlike previous studies, we incorporate teacher comments on student assignments as textual inputs alongside traditional numerical features. Extensive experiments based on a real-world dataset of approximately 14,878 student records from Shandong University of Finance and Economics (SDUFE) demonstrate that the proposed hybrid model outperforms existing approaches in predictive performance. The results also highlight the significant value of leveraging teacher feedback for improving prediction accuracy, offering practical insights for enhancing educational management and supporting student success in blended learning contexts in higher education.
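A minimal sketch of the CNN + BiLSTM combination over a per-week behavioral feature sequence; all dimensions are illustrative, and the text branch for teacher comments is omitted for brevity:

```python
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """1D convolutions capture local correlations among features at each
    time step; a BiLSTM then models temporal dynamics across the term."""
    def __init__(self, n_features=16, hidden=64, n_classes=2):
        super().__init__()
        self.conv = nn.Conv1d(n_features, 32, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                              # x: (batch, time, features)
        h = torch.relu(self.conv(x.transpose(1, 2)))   # -> (batch, 32, time)
        out, _ = self.lstm(h.transpose(1, 2))          # -> (batch, time, 2*hidden)
        return self.fc(out[:, -1])                     # classify from the last step
```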
{"title":"A Unified Deep Learning Framework for Student Performance Prediction With Multimodal Data in a Blended Learning Environment","authors":"Wu Xiuguo","doi":"10.1155/int/7978546","DOIUrl":"https://doi.org/10.1155/int/7978546","url":null,"abstract":"<p>Prediction of student performance is crucial for enhancing the quality of higher education worldwide by enabling timely interventions and personalized support. The blended learning environment, which integrates online and offline instruction, has become a predominant paradigm, yet it also brings significant challenges for the prediction of student performance due to the inherent complexity and heterogeneity of multimodal data generated across both environments. Specifically, existing approaches often fail to effectively leverage the synergistic potential of numerical behavioral traces and unstructured textual feedback from instructors. To address this problem, this study proposes a unified deep learning framework that integrates convolutional neural networks (CNN) and bidirectional long short-term memory (BiLSTM) to capture both temporal dynamics and spatial correlations among features in blended learning environment. Unlike previous studies, we incorporate teacher comments on student assignments as textual inputs alongside traditional numerical features. Extensive experiments based on a real-world dataset of approximately 14,878 student records from Shandong University of Finance and Economics (SDUFE) demonstrate that the proposed hybrid model outperforms existing approaches in predictive performance. The results also highlight the significant value of leveraging teacher feedback for improving prediction accuracy, offering practical insights for enhancing educational management and supporting student success in blended learning contexts in higher education.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/7978546","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145521680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Task Scheduling for Heterogeneous Multi-Core Processors Based on Deep Reinforcement Learning
Qiguang Tan, Wei Chen, Dake Liu
Heterogeneous multicore processor systems are commonly used to schedule the tasks of DAG applications. Deep reinforcement learning, with its ability to make decisions directly from perception and to handle high-dimensional state and action spaces, has become a prevalent solution for scheduling these systems. However, its incomplete environment models and large action spaces present significant challenges. This paper investigates a scheduling problem in a heterogeneous multicore processor environment. First, system environment information is extracted and encoded using a graph convolutional neural network that integrates adapter and AdapterFusion modules into the transformer architecture. Then, by separating task selection from processor allocation, the decision space is reduced: the former uses a deep neural network that learns to select nodes, and the latter allocates processors using a heuristic scheduling algorithm combining earliest-completion-time-based node replication with rolling technology. The entire scheduling process is formulated as a Markov decision process, so the PPO algorithm with dynamic adjustment of the clipping factor, combined with an advantage actor-critic network, is employed to train, optimize, and evaluate the policy and find the optimal scheduling strategy. Training adopts a reward function based on the time and power consumption of completed task schedules, so that scheduling multiple DAG applications achieves optimal performance. Experiments conducted in various environments with different parameters show that, compared to other algorithms, this algorithm reduces the overall execution time and power consumption cost of heterogeneous multicore processor tasks by 11.09%.
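The PPO component is the standard clipped surrogate; the sketch below shows it together with one plausible (assumed) dynamic schedule for the clipping factor, since the abstract does not specify how it is adjusted:

```python
import torch

def ppo_clip_loss(ratio, advantage, eps):
    """Standard PPO clipped surrogate loss; `eps` is the clipping factor."""
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -torch.min(unclipped, clipped).mean()

def decayed_eps(step, total_steps, eps0=0.2, eps_min=0.05):
    """Assumed schedule: linearly decay the clip range over training."""
    return eps_min + (eps0 - eps_min) * max(0.0, 1.0 - step / total_steps)
```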
{"title":"Task Scheduling for Heterogeneous Multi-Core Processors Based on Deep Reinforcement Learning","authors":"Qiguang Tan, Wei Chen, Dake Liu","doi":"10.1155/int/7562400","DOIUrl":"https://doi.org/10.1155/int/7562400","url":null,"abstract":"<p>Heterogeneous multicore processor systems are commonly used for scheduling tasks of DAG applications. Deep reinforcement learning, with its superior ability to perceive decisions directly and handle high-dimensional state actions, has become a prevalent solution for scheduling these systems. However, the incomplete environment models and large action spaces of deep reinforcement learning present significant challenges to scheduling. This paper investigates a scheduling problem in a heterogeneous multicore processor environment. Initially, system environment information is extracted and encoded using a graph convolutional neural network based on integrating adapter and AdapterFusion into the transformer architecture. Then, by separating task selection and processor allocation, the decision space is reduced: the former uses a deep neural network to learn to select nodes, and the latter allocates processors using a heuristic scheduling algorithm combining earliest completion time-based node replication and rolling technology. The entire scheduling process is a Markov decision problem. Therefore, the PPO algorithm with dynamic adjustment of the clipping factor, combined with an advantage actor-critic network, is employed for training, optimizing, and evaluating the algorithm to find the optimal scheduling strategy. The training process adopts a reward function for the time and power consumption required for completed task scheduling to ensure that multiple DAG application task scheduling can achieve optimal performance. Experiments conducted in various environments with different parameters show that, compared to other algorithms, this algorithm reduces the overall execution time and power consumption cost of heterogeneous multicore processor tasks by 11.09%.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/7562400","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145469760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Underactuated Dynamic Visual Servoing of Aerial Mobile Robots Using Adaptive Calibration of Camera
Yi Lyu, Aoqi Liu, Zhengfei Wen, Guanyu Lai, Weijun Yang, Qiangqiang Dong
The dynamic visual servoing problem studied in this paper differs from existing approaches in two key aspects: the dynamics of the aerial mobile robot are underactuated, and the onboard camera is adaptively calibrated. To address the first challenge, a novel cascade visual servoing framework is developed, consisting of three control loops: the image loop, the attitude loop, and the angular velocity loop. Based on this framework, an extended eye-in-hand vision system is constructed, in which the perspective projection of feature points onto the image plane is decoupled from the rigid body’s attitude. This design allows the proposed visual controller to effectively compensate for image dynamics. Furthermore, unknown intrinsic and extrinsic camera parameters make compensation for image dynamics more difficult. To overcome this issue, a depth-independent composite matrix is introduced, enabling the unknown visual dynamics to be linearly parameterized and integrated with an adaptive control technique. A novel online algorithm is developed to estimate the unknown camera parameters in real time, and an additional adaptation mechanism is incorporated to estimate the rotational inertia of the rigid body. Using Lyapunov theory and Barbalat’s lemma, it is proven that the image tracking error asymptotically converges to zero while all physical variables remain locally bounded. Experimental results confirm that the image tracking error converges to zero over time, with a maximum deviation of no more than two pixels, thereby validating the effectiveness of the proposed visual controller.
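The linear parameterization that makes the adaptive calibration tractable follows the standard pattern in adaptive visual servoing; in generic form (the paper's exact regressor and gains are not given in the abstract):

```latex
% Image-space error e, regressor Y built from robot/camera signals,
% unknown parameter vector \theta (camera intrinsics/extrinsics, inertia),
% positive-definite adaptation gain \Gamma.
\dot{\hat{\theta}} = -\Gamma\, Y^{\top} e,
\qquad
V = \tfrac{1}{2}\, e^{\top} e + \tfrac{1}{2}\, \tilde{\theta}^{\top} \Gamma^{-1} \tilde{\theta}.
```

Differentiating V along the closed loop yields a nonpositive derivative, and Barbalat's lemma then gives e converging to zero with all estimates bounded, matching the convergence claim above.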
{"title":"Underactuated Dynamic Visual Servoing of Aerial Mobile Robots Using Adaptive Calibration of Camera","authors":"Yi Lyu, Aoqi Liu, Zhengfei Wen, Guanyu Lai, Weijun Yang, Qiangqiang Dong","doi":"10.1155/int/1464484","DOIUrl":"https://doi.org/10.1155/int/1464484","url":null,"abstract":"<p>The dynamic visual servoing problem studied in this paper differs from existing approaches in two key aspects: the dynamics of the aerial mobile robot are underactuated, and the onboard camera is adaptively calibrated. To address the first challenge, a novel cascade visual servoing framework is developed, consisting of three control loops: the image loop, the attitude loop, and the angular velocity loop. Based on this framework, an extended eye-in-hand vision system is constructed, in which the perspective projection of feature points onto the image plane is decoupled from the rigid body’s attitude. This design allows the proposed visual controller to effectively compensate for image dynamics. Furthermore, unknown intrinsic and extrinsic camera parameters make compensation for image dynamics more difficult. To overcome this issue, a depth-independent composite matrix is introduced, enabling the unknown visual dynamics to be linearly parameterized and integrated with an adaptive control technique. A novel online algorithm is developed to estimate the unknown camera parameters in real time, and an additional adaptation mechanism is incorporated to estimate the rotational inertia of the rigid body. Using Lyapunov theory and Barbalat’s lemma, it is proven that the image tracking error asymptotically converges to zero while all physical variables remain locally bounded. Experimental results confirm that the image tracking error converges to zero over time, with a maximum deviation of no more than two pixels, thereby validating the effectiveness of the proposed visual controller.</p>","PeriodicalId":14089,"journal":{"name":"International Journal of Intelligent Systems","volume":"2025 1","pages":""},"PeriodicalIF":3.7,"publicationDate":"2025-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1155/int/1464484","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145469761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}