Intelligent Systems with Applications: Latest Articles

Visual question answering for medical diagnosis
Pub Date : 2025-07-09 DOI: 10.1016/j.iswa.2025.200545
Nawel Ben Chaabane, Mohamed Bal-Ghaoui
The use of Artificial Intelligence (AI) in medical diagnosis is a breakthrough in healthcare, improving both accuracy and efficiency. Recently, a significant advancement has been made toward the development of multimodal AI systems that can process and integrate multiple types of data or modalities. This ability is key for interpreting medical images, such as X-rays, CT, and MRI scans, as well as textual data like electronic health records (EHRs) and clinical notes. In this era, Visual Question Answering (VQA) systems have demonstrated a potential use case in the medical domain. These systems, typically based on Vision-Language Models (VLMs), can answer natural language questions based on medical images, offering precise and relevant responses that help doctors make better decisions.
In this article, we evaluate existing medical VQA models along with general and trending ones to make medical diagnoses. In particular, we focus on addressing abnormality questions considered challenging in the literature. Our approach consists of evaluating the Zero-Shot (ZS) general and domain-specific capabilities of different models using two created datasets, and fine-tuning the best-found models on the training set of the abnormality dataset before evaluating their performances quantitatively and qualitatively. IdeficMed, a generative domain-specific model, achieved better consistency and VQA outcomes by only training 0.22 % of its parameters. Additionally, we employed uncertainty quantification techniques (e.g., Monte Carlo dropout) to assess the confidence of the fine-tuned models in their predictions. We also conducted a sensitivity analysis on input perturbations, such as image noise and ambiguous questions.
Intelligent Systems with Applications, Volume 27, Article 200545
Citations: 0
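The Monte Carlo dropout idea named in the abstract above can be illustrated with a minimal sketch: keep dropout active at prediction time, run several stochastic forward passes, and read the spread of the outputs as a confidence signal. The one-layer logistic model, the weights, and the function names below are hypothetical stand-ins chosen for illustration; they are not the fine-tuned VQA models evaluated in the paper.

```python
import math
import random
import statistics


def forward_with_dropout(weights, features, drop_prob, rng):
    """One stochastic forward pass of a toy logistic model (hypothetical).

    Each input feature is dropped with probability drop_prob; surviving
    features are rescaled by 1 / keep (inverted dropout).
    """
    keep = 1.0 - drop_prob
    z = 0.0
    for w, x in zip(weights, features):
        if rng.random() < keep:
            z += w * x / keep
    return 1.0 / (1.0 + math.exp(-z))


def mc_dropout_predict(weights, features, drop_prob=0.2, passes=200, seed=0):
    """Return (mean, standard deviation) of the output over repeated passes."""
    rng = random.Random(seed)
    samples = [
        forward_with_dropout(weights, features, drop_prob, rng)
        for _ in range(passes)
    ]
    return statistics.fmean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    mean, spread = mc_dropout_predict([0.8, -0.5, 1.2], [1.0, 2.0, 0.5])
    print(f"mean prediction = {mean:.3f}, spread = {spread:.3f}")
```

A larger spread suggests the model's answer depends heavily on which units were dropped, which is one way to flag low-confidence predictions for review.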
On adversarial attack detection in the artificial intelligence era: Fundamentals, a taxonomy, and a review
Pub Date : 2025-07-07 DOI: 10.1016/j.iswa.2025.200554
Noora Al Roken, Hakim Hacid, Ahmed Bouridane, Abir Hussain
The rapid advancement and sophisticated deployment of artificial intelligence tools by malicious actors have led to the rise of highly complex cyber-attacks that evolve quickly. This rapid evolution has made traditional defense systems increasingly ineffective at detecting and mitigating these hidden threats. Adversarial attacks are a prime example of such sophisticated cyber-attacks; they subtly alter attack patterns to evade detection by intelligent systems while still maintaining their harmful functionality. This paper provides a comprehensive overview of computer malware, examining both traditional concealment methods and more advanced adversarial techniques. It includes an in-depth analysis of recent research efforts aimed at detecting previously unseen adversarial attacks using both traditional and AI-driven approaches. Furthermore, this study discusses the limitations of current network intrusion detection systems and proposes directions for future research.
Intelligent Systems with Applications, Volume 27, Article 200554
Citations: 0
Cancelable random masking with deep learning for secure and interpretable finger vein authentication
Pub Date : 2025-07-07 DOI: 10.1016/j.iswa.2025.200552
Mohamed Hammad, Abdelhamied A. Ateya, Mohammed ElAffendi, Ahmed A. Abd El-Latif
In the area of identity verification and authentication, biometrics has emerged as a reliable means of recognizing individuals based on their unique behavioral or physical characteristics. Finger vein authentication, with its robustness, resistance to spoofing, and stable patterns, has gained significant attention as a biometric modality. This paper introduces a novel framework that integrates Cancelable Random Masking (CRM) with a lightweight deep learning model for secure and interpretable finger vein authentication. The CRM technique transforms biometric templates using cryptographic random masks, ensuring cancelability, revocability, and privacy. These transformed templates are then processed by a convolutional neural network (CNN) designed to learn discriminative features directly from masked inputs without relying on handcrafted feature extraction. Our method enhances transparency by making the transformation process interpretable and provides strong security against template inversion and adversarial attacks. Experiments conducted on three publicly available databases demonstrate the proposed framework’s superior performance in terms of accuracy, robustness, and resistance to spoofing and replay attacks. This is the first framework to integrate CRM within a deep learning model, satisfying all cancelable biometric criteria while enabling real-time, interpretable, and secure finger vein authentication.
Intelligent Systems with Applications, Volume 27, Article 200552
Citations: 0
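The general idea of a cancelable, key-dependent mask described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's CRM transform: the abstract does not specify how the masks are generated or applied, so the SHA-256 counter-mode expansion, the zeroing of masked pixels, and the names `keyed_mask` and `apply_mask` are all hypothetical.

```python
import hashlib


def keyed_mask(key: bytes, length: int):
    """Expand a user-specific key into `length` pseudo-random bits.

    Assumption for illustration: SHA-256 is run in counter mode and its
    output bits are used directly as the binary mask.
    """
    bits = []
    counter = 0
    while len(bits) < length:
        block = hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        for byte in block:
            for shift in range(8):
                bits.append((byte >> shift) & 1)
        counter += 1
    return bits[:length]


def apply_mask(pixels, key: bytes):
    """Keep a pixel where the mask bit is 1 and zero it where the bit is 0."""
    mask = keyed_mask(key, len(pixels))
    return [p if m else 0 for p, m in zip(pixels, mask)]


if __name__ == "__main__":
    template = list(range(1, 17))            # toy flattened vein image
    enrolled = apply_mask(template, b"user-key-v1")
    reissued = apply_mask(template, b"user-key-v2")
    print(enrolled)
    print(reissued)
```

The same key always reproduces the same protected template, while issuing a new key yields a different one; that is the revocability property in miniature. A production scheme would need a stronger non-invertibility argument than this toy mask provides.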
Certified Accuracy and Robustness: How different architectures stand up to adversarial attacks
Pub Date : 2025-07-07 DOI: 10.1016/j.iswa.2025.200555
Azryl Elmy Sarih, Nagender Aneja, Ong Wee Hong
Adversarial attacks are a concern for image classification using neural networks. Numerous methods have been created to minimize the effects of attacks; adversarial training has proven to be the most successful defense to date. Due to the nature of adversarial attacks, it is difficult to assess a network’s ability to defend against them. The standard method of assessing a network’s performance in supervised image classification tasks is based on accuracy. However, this assessment method, while still important, is insufficient when adversarial attacks are included. A new metric called certified accuracy is used to assess network performance when samples are perturbed by adversarial noise. This paper supplements certified accuracy with an abstention rate to give more insight into the network’s robustness. The abstention rate measures the percentage of the network’s predictions that fail to remain unchanged as the perturbation strength increases from zero to a specified strength. The study focuses on popular and well-performing CNN-based architectures, specifically EfficientNet-B7, ResNet-50, ResNet-101, and Wide-ResNet-101, and on transformer architectures such as CaiT and ViT-B/16. The selected architectures are trained with adversarial and standard methods and then certified on the CIFAR-10 dataset perturbed with Gaussian noise of different strengths. Our results show that transformers are more resilient to adversarial attacks than CNN-based architectures by a significant margin. Transformers exhibit better certified accuracy and tolerance to stronger noise than CNN-based architectures, demonstrating good robustness with and without adversarial training. The width and depth of a network have little effect on robustness against adversarial attacks; the techniques deployed in the network are more impactful, and attention mechanisms have been shown to improve a network’s robustness.
Intelligent Systems with Applications, Volume 27, Article 200555
Citations: 0
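To make the abstention-rate idea in the abstract above concrete, the sketch below classifies many Gaussian-noised copies of each input, takes a majority vote, and abstains when the vote is not decisive. It is only a rough illustration: the toy `base_classifier`, the fixed vote-share threshold, and the way the rate is computed are assumptions, and the paper's actual certification procedure is not described in the abstract.

```python
import random
from collections import Counter

ABSTAIN = -1


def base_classifier(x):
    """Hypothetical stand-in for a trained network: class 1 if the mean is positive."""
    return 1 if sum(x) / len(x) > 0 else 0


def smoothed_predict(x, sigma, n_samples=500, min_vote_share=0.7, seed=0):
    """Majority vote over Gaussian-perturbed copies of x, or ABSTAIN if indecisive."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label if count / n_samples >= min_vote_share else ABSTAIN


def abstention_rate(inputs, sigma):
    """Fraction of inputs whose noise-free prediction is not preserved at strength sigma."""
    changed = 0
    for x in inputs:
        if smoothed_predict(x, sigma) != base_classifier(x):
            changed += 1
    return changed / len(inputs)


if __name__ == "__main__":
    data = [[5.0, 5.0], [-5.0, -5.0], [0.001, 0.001]]
    for sigma in (0.0, 0.5, 5.0):
        print(sigma, abstention_rate(data, sigma))
```

Inputs far from the decision boundary keep their label under mild noise, while inputs near the boundary are the first to be abstained on as the noise strength grows.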
Cognitive map formation under uncertainty via local prediction learning
Pub Date : 2025-07-07 DOI: 10.1016/j.iswa.2025.200551
Calvin Yeung, Zhuowen Zou, Nathaniel D. Bastian, Mohsen Imani
Cognitive maps are internal world models that enable adaptive behaviour including spatial navigation and planning. The Cognitive Map Learner (CML) has been recently proposed as a model for cognitive map formation and planning. A CML learns high dimensional state and action representations using local prediction learning. While the CML offers a simple and elegant solution to cognitive map learning, it is limited by its simplicity, applying only to fully observable environments. To address this, we introduce the Partially Observable Cognitive Map Learner (POCML), extending the CML to handle partially observable environments.
The POCML uses a superposition of states represented via random Fourier features for probabilistic representation and uses the binding operation for parallel state updates. It features an associative memory to enable adaptive behaviour across environments with similar structures. We derive local update rules based on the POCML’s probabilistic state representation and associative memory. We show that a POCML is capable of learning the underlying structure of an environment via local next-observation prediction learning. In addition, we show that a POCML trained on an environment is capable of generalizing to environments with the same underlying structure but with novel observations, achieving good zero-shot next-observation prediction accuracy, significantly outperforming sequence models such as LSTMs and transformers. Finally, we present a case study of navigation in a two-tunnel maze environment with aliased observations, showing that a POCML is capable of effectively using its probabilistic state representations for disambiguation of states and spatial navigation.
Intelligent Systems with Applications, Volume 27, Article 200551
Citations: 0
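The abstract above mentions states encoded with random Fourier features, superposed for a probabilistic representation, and updated in parallel through a binding operation. One common construction with these properties encodes a scalar as a vector of complex phasors and binds by element-wise multiplication. The sketch below follows that generic construction; it is an assumption-laden illustration, not the POCML's actual update rules, and all function names are hypothetical.

```python
import cmath
import random


def make_frequencies(dim, seed=0):
    """Draw random frequencies for the phasor encoding (assumed Gaussian)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def encode(value, freqs):
    """Map a scalar to unit-modulus phasors exp(i * w * value)."""
    return [cmath.exp(1j * w * value) for w in freqs]


def bind(u, v):
    """Element-wise product; for phasor codes this adds the encoded values."""
    return [a * b for a, b in zip(u, v)]


def superpose(weighted_vectors):
    """Weighted sum of vectors, e.g. a belief over several candidate states."""
    dim = len(weighted_vectors[0][1])
    return [sum(p * vec[k] for p, vec in weighted_vectors) for k in range(dim)]


def similarity(u, v):
    """Real part of the normalized inner product between two codes."""
    return (sum(a * b.conjugate() for a, b in zip(u, v)) / len(u)).real


if __name__ == "__main__":
    freqs = make_frequencies(512)
    belief = superpose([(0.5, encode(0.0, freqs)), (0.5, encode(5.0, freqs))])
    moved = bind(belief, encode(1.0, freqs))  # one step of +1 applied to both states
    print(similarity(moved, encode(1.0, freqs)), similarity(moved, encode(6.0, freqs)))
```

Because binding distributes over the weighted sum, a single element-wise product shifts every state held in the superposition at once, which is the sense in which the update is parallel.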
Design and development of a dexterous soft-robotics based assistive exoglove with kinematic modeling
Pub Date : 2025-07-04 DOI: 10.1016/j.iswa.2025.200550
Nawara Mahmood Broti, Shamim Ahmed Deowan, A.S.M. Shamsul Arefin
The necessity of a fully functional hand in daily life is beyond description. Yet, a portion of the population is unable to move and control their hands due to paralysis. An assistive device can aid both daily activities and rehabilitation. This paper presents a dexterous soft robotics-based assistive glove with a spatial kinematic model and control system. Unlike existing designs, our proposed five-fingered glove provides 20 degrees of freedom (DoFs), closely resembling a human hand. Each finger has 4 DoFs with controlled flexion, extension, abduction, and adduction motion ability. The tendon-driven mechanism simplifies design and control, while 3D-printed thermoplastic polyurethane (TPU) material ensures comfort, lightness, and an anthropomorphic appearance. The derived forward and inverse kinematics of each finger are capable of mapping joint angles to fingertip positions and orientations. To validate the kinematic model, a virtual simulation was conducted to confirm its accuracy, while basic hand functionality experiments demonstrated the glove’s effectiveness. We expect this research to contribute to medical robotics, biomechanics, and assistive technology.
Intelligent Systems with Applications, Volume 27, Article 200550
Citations: 0
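The mapping from joint angles to fingertip position mentioned in the abstract above can be illustrated with a simplified forward-kinematics sketch for one 4-DoF finger: three flexion joints acting in a plane, plus one abduction angle that rotates that plane about the vertical axis. The link lengths, the axis conventions, and the function name are hypothetical assumptions; the paper's own kinematic model is not given in the abstract and may differ.

```python
import math


def fingertip_position(link_lengths, flexion_angles, abduction_angle):
    """Fingertip (x, y, z) for a simplified 4-DoF finger model (assumed).

    flexion_angles are the three joint flexions in radians; positive
    flexion curls the finger toward the palm (negative z). The
    abduction_angle rotates the flexion plane about the z axis.
    """
    cumulative = 0.0
    reach = 0.0   # distance along the finger's pointing direction
    height = 0.0  # displacement along z
    for length, angle in zip(link_lengths, flexion_angles):
        cumulative += angle
        reach += length * math.cos(cumulative)
        height -= length * math.sin(cumulative)
    x = reach * math.cos(abduction_angle)
    y = reach * math.sin(abduction_angle)
    return x, y, height


if __name__ == "__main__":
    links_mm = (45.0, 25.0, 20.0)  # hypothetical phalanx lengths
    pose = (math.radians(30), math.radians(45), math.radians(20))
    print(fingertip_position(links_mm, pose, math.radians(10)))
```

Inverse kinematics goes the other way, solving for joint angles that reach a desired fingertip position, and typically requires a closed-form derivation or a numerical solver on top of a forward model like this one.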
The Role and Applications of Semantic Interoperability Tools and eXplainable AI in the Development of Smart Food Systems: Findings from a Systematic Literature Review
Pub Date : 2025-06-27 DOI: 10.1016/j.iswa.2025.200547
Donika Xhani, Gayane Sedrakyan, Anand Gavai, Renata Guizzardi, Jos van Hillegersberg
Smart food systems generate vast and diverse data across the supply chain, yet inconsistent data structures and limited interoperability hinder their full potential. Achieving semantic interoperability, where systems can exchange and interpret data with shared meaning, is essential for enabling intelligent integration and decision-making. Tools such as ontologies, knowledge graphs, and reasoning engines play a key role in this process. In this paper, we refer to these as Semantic Interoperability (SI) tools: a broad category that includes technologies grounded in Semantic Web standards (e.g., RDF, OWL, SPARQL) but emphasizes their applied role in aligning meaning across heterogeneous systems. Coupled with eXplainable Artificial Intelligence (XAI), these technologies enhance transparency and trust in AI-driven decisions, such as personalized food recommendations tailored to an individual’s health conditions and preferences. This paper presents a Systematic Literature Review (SLR) examining the role of semantic interoperability tools and XAI in the development of smart food systems. Through an analysis of 39 studies, the review identifies key semantic technologies and XAI methods used in food systems, with a focus on their application in intelligent food recommendation systems. The findings reveal that while significant progress has been made, current systems often lack adequate transparency and personalization, limiting user trust and engagement. To address these gaps, the paper proposes the integration of semantic interoperability tools with XAI to create smarter, more reliable food systems. 
As part of this effort, the paper introduces the conceptual model for the Semantic Explainable Food Recommendation Ontology (SEFRO), a work-in-progress ontology, designed to connect entities and relationships within food systems in an intelligent manner, with the goal of enabling personalized, explainable, and interoperable food recommendations that meet the growing demands for smart food systems.
Citations: 0
A multi-source multi-layer-based transfer learning approach for forecasting customer demands of newly launched products
Pub Date : 2025-06-27 DOI: 10.1016/j.iswa.2025.200548
Supriyo Ahmed, Ripon K. Chakrabortty, Daryl L. Essam
Forecasting demand for newly launched products is challenging for supply chain practitioners, often because historical data are lacking. Market surveys, and knowledge extracted by examining similar products on the market to infer how a new product will behave, can be inaccurate; the resulting errors ultimately lead to a misestimation of the overall cost of a business. With the advancement of artificial intelligence (AI) approaches such as Transfer Learning (TL), this misestimation can be reduced by forecasting the demand for newly launched products more accurately, drawing on the historical data of other similar products. Consequently, this paper investigates several classical AI-based TL approaches to predict customer demand for new products and stores. Thereafter, a novel Multi-Source Multi-Layer Transfer Learning approach with a Recursive Feature Elimination strategy (MSML-TL-RFE) is proposed to exploit the model's ability to extract knowledge from multiple sources for different days-ahead predictions, distinguishing it from the other investigated approaches. An abstract concept of a supply chain with information sharing among retailers is investigated to show that such concepts can enhance the knowledge transfer ability of a system. A hierarchical two-echelon supply chain model with different attributes is developed to validate the proposed MSML-TL-RFE approach against a few other TL-based forecasting approaches. The feature-rich datasets are then transformed so that they depict a hierarchical supply chain structure, allowing TL to be applied effectively to forecasting consumer demand for recently introduced products. Continuing with the idea of information sharing, the paper investigates how to find comparable sources for a quick and effective knowledge transfer procedure, considering the peculiarities of a given data set. MSML-TL-RFE predictions and those of other TL-based approaches are analysed by calculating overall supply chain costs. Based on these costs under static and dynamic lead time settings, the effectiveness and applicability of the proposed MSML-TL-RFE against traditional forecasting approaches are demonstrated. Incorporating MSML-TL-RFE with three sources improves accuracy, defined as the reciprocal of Root Mean Square Error (RMSE), from 4.83 (no TL) to 5.67, and accuracy increases further to 5.76 with additional sources, enabling more accurate predictions and reduced supply chain costs for businesses.
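The accuracy measure quoted in the abstract, the reciprocal of RMSE, can be illustrated with a minimal sketch. The function names and the demand values below are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's own preprocessing or scaling of demand is not reproduced here.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def reciprocal_rmse_accuracy(actual, predicted):
    """Accuracy as defined in the abstract: 1 / RMSE (undefined when RMSE is 0)."""
    error = rmse(actual, predicted)
    if error == 0:
        raise ZeroDivisionError("RMSE is zero; reciprocal accuracy is undefined")
    return 1.0 / error


# Hypothetical demand values, for illustration only (not data from the paper).
actual_demand = [10.0, 12.0, 9.0, 11.0]
forecast = [10.5, 11.5, 9.5, 10.5]
score = reciprocal_rmse_accuracy(actual_demand, forecast)

# Converting the accuracies reported in the abstract back to RMSE values.
reported = {"no TL": 4.83, "three sources": 5.67, "additional sources": 5.76}
implied_rmse = {name: 1.0 / acc for name, acc in reported.items()}
```

Under this definition, the reported accuracies of 4.83, 5.67 and 5.76 correspond to RMSE values of roughly 0.207, 0.176 and 0.174, in whatever units the paper's demand data use.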
Citations: 0
Federated learning-enabled lightweight intrusion detection system for wireless sensor networks: A cybersecurity approach against DDoS attacks in smart city environments
Pub Date : 2025-06-26 DOI: 10.1016/j.iswa.2025.200553
Manu Devi, Priyanka Nandal, Harkesh Sehrawat

Background

Wireless Sensor Networks (WSNs) are vital in applications such as healthcare, smart cities, and environmental monitoring, but are vulnerable to cyberattacks due to their resource-constrained nature. Traditional Intrusion Detection Systems (IDS) depend on centralized architectures, which increase communication overhead and privacy risks and create a single point of failure.

Objective

This paper proposes a novel Federated Learning-based Lightweight IDS (FL-LIDS) that utilizes optimized lightweight models to enable real-time, privacy-preserving DDoS attack detection in resource-constrained WSNs for smart city environments and presents a comprehensive comparative analysis of models to evaluate their effectiveness within the Federated Learning (FL) framework.

Methods

FL-LIDS uses optimized lightweight deep learning models for intrusion detection. These models provide effective anomaly recognition with minimal resource usage, which makes the system suitable for resource-limited WSN environments. The lightweight models are evaluated for efficiency on the TON-IoT dataset.
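As a rough illustration of how a federated setup combines models without sharing raw sensor data, the sketch below averages client parameters weighted by local sample counts. This FedAvg-style rule is an assumption made for illustration; the abstract does not state which aggregation rule FL-LIDS uses, and the function name and client values are hypothetical.

```python
def federated_average(client_updates):
    """Weighted average of client parameter vectors (FedAvg-style aggregation).

    client_updates: list of (num_samples, weights) pairs, where weights is a
    list of floats with the same length for every client.
    """
    if not client_updates:
        raise ValueError("need at least one client update")
    total_samples = sum(n for n, _ in client_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_updates[0][1])
    averaged = [0.0] * dim
    for num_samples, weights in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        # Each client contributes in proportion to its local dataset size.
        share = num_samples / total_samples
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged


# Three hypothetical sensor-node clients with different local dataset sizes.
updates = [
    (100, [0.0, 1.0]),
    (300, [1.0, 1.0]),
    (100, [2.0, 0.0]),
]
global_weights = federated_average(updates)
```

Only the parameter vectors and sample counts leave each client in this sketch, which is the property that lets a federated IDS avoid centralizing traffic records.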

Results

The study demonstrates the effectiveness of the FL-LIDS variants in detecting and preventing DDoS attacks with high detection rates and minimal latency. Performance is measured by accuracy, F1-score, precision, and recall in emulated WSN scenarios. The lightweight deep learning architecture balances accuracy against computational cost, and the lightweight hybrid CNN + LSTM model achieves the best intrusion detection performance, making it well suited to WSN-based smart city environments.
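The four metrics named above follow from the counts of true and false positives and negatives. The sketch below computes them for a binary attack/benign labelling; the function name and the labels are hypothetical, and the paper's own evaluation code is not shown in the abstract.

```python
def classification_metrics(true_labels, predicted_labels, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = fp = fn = tn = 0
    for t, p in zip(true_labels, predicted_labels):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    accuracy = (tp + tn) / len(true_labels)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = DDoS traffic, 0 = benign traffic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

In this toy example there are three true positives, one false positive, one false negative and three true negatives, so all four metrics come out at 0.75.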

Conclusion

These cybersecurity systems provide a highly scalable and robust means of protecting smart city ecosystems so that services can be delivered without interruption. This research indicates that FL provides an effective cybersecurity solution for WSNs.
Citations: 0
A real-time semantic segmentation network leveraging spatial and contextual features for enhanced scene understanding
Pub Date : 2025-06-24 DOI: 10.1016/j.iswa.2025.200542
Haifeng Sima, Meng Gao, Lanlan Liu
Real-time semantic segmentation of images requires both rich contextual information and accurate spatial information. However, repeated downsampling in deep convolutional neural networks often leads to the loss of such information, reducing segmentation accuracy. To address this problem, we propose SPCONet, a lightweight real-time semantic segmentation network that integrates spatial and contextual features. The network incorporates three key modules: (1) a Spatial Feature Aggregation Module (SFAM) that captures fine spatial details from shallow layers using spatially separable convolutions with multiple kernel sizes; (2) a Contextual Information Retrieval Module (CIRM) that extracts semantic context from deeper layers using dynamic convolution; (3) an Attention Fusion Module (AFM) that combines spatial and contextual features via local and global attention mechanisms. Quantitative experiments show that SPCONet achieves 77.5% mIoU at 74 FPS on the Cityscapes dataset and 75.3% mIoU at 82 FPS on the CamVid dataset. These results suggest that SPCONet provides an effective balance between segmentation accuracy and real-time inference capability.
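The mIoU figure reported above is the mean, over classes, of intersection over union between predicted and ground-truth label maps. The sketch below shows that computation on a flattened toy label map; the function name and labels are hypothetical, and benchmark protocols differ in details (for example, ignored labels and dataset-wide accumulation) that are not reproduced here.

```python
def mean_iou(true_labels, pred_labels, num_classes):
    """Mean Intersection over Union across classes that occur in either map."""
    if len(true_labels) != len(pred_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    ious = []
    for c in range(num_classes):
        intersection = sum(
            1 for t, p in zip(true_labels, pred_labels) if t == c and p == c
        )
        union = sum(
            1 for t, p in zip(true_labels, pred_labels) if t == c or p == c
        )
        # Skip classes absent from both maps so they do not distort the mean.
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) appears in the inputs")
    return sum(ious) / len(ious)


# Hypothetical flattened label maps for a six-pixel image with three classes.
ground_truth = [0, 0, 1, 1, 2, 2]
prediction = [0, 1, 1, 1, 2, 0]
miou = mean_iou(ground_truth, prediction, num_classes=3)
```

Here the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5.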
Citations: 0