Intelligent Systems with Applications: Latest Articles

Federated learning using quality-based aggregation method for brain tumour segmentation on multimodality medical images
IF 4.3 Pub Date : 2025-11-08 DOI: 10.1016/j.iswa.2025.200601
Rim El Badaoui , Ester Bonmati , Vasileios Argyriou , Barbara Villarini
Deep learning for medical imaging has shown great potential in improving patient outcomes due to its high accuracy in disease diagnosis. However, a major challenge preventing the widespread adoption of such models in clinical settings is data accessibility, which conflicts with the General Data Protection Regulation (GDPR) in a traditional centralised training environment. Hence, to address this issue, Federated Learning (FL) was introduced as a decentralised alternative that enables collaborative model training among data owners without sharing any private data. Despite its significance in healthcare, limited research has explored FL for medical imaging, particularly in multimodal brain tumour segmentation, due to challenges such as data heterogeneity.
In this study, we present Federated E-CATBraTS, an advanced federated deep learning model derived from the existing E-CATBraTS framework. This model is designed to segment brain tumours from multimodal magnetic resonance imaging (MRI) while preserving data privacy. Our framework introduces a novel aggregation method, DaQAvg, which optimally combines model weights based on data size and quality, demonstrating resilience against corrupted medical images.
We evaluated the performance of Federated E-CATBraTS using two publicly available datasets: UPenn-GBM and UCSF-PDGM, including a degraded version of the latter to assess the efficacy of our aggregation method. The results indicate a 6% overall improvement over traditional centralised approaches. Furthermore, we conducted a comprehensive comparison against state-of-the-art FL aggregation algorithms, including FedAVG, FedProx and FedNova. While FedNova demonstrated the highest overall DSC, DaQAvg demonstrated superior robustness to noisy conditions, showcasing its specific advantage in maintaining performance with variable data quality, a critical aspect in medical imaging.
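The abstract describes DaQAvg only as combining client weights by data size and quality; the exact formula is not given here. As a minimal sketch under that assumption, the following weights each client by the product of its sample count and a quality score in [0, 1], then normalises. The function name `quality_weighted_average` and the product rule are illustrative choices, not the authors' method.

```python
from typing import Dict, List


def quality_weighted_average(
    client_weights: List[Dict[str, List[float]]],
    num_samples: List[int],
    quality: List[float],
) -> Dict[str, List[float]]:
    """Average client parameters, weighting each client by size * quality.

    Hypothetical rule for illustration; the paper's DaQAvg formula may differ.
    """
    if not (len(client_weights) == len(num_samples) == len(quality)):
        raise ValueError("one size and one quality score per client required")
    raw = [n * q for n, q in zip(num_samples, quality)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("at least one client needs positive size and quality")
    coeffs = [r / total for r in raw]  # normalised aggregation coefficients

    aggregated: Dict[str, List[float]] = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        aggregated[name] = [
            sum(c * w[name][i] for c, w in zip(coeffs, client_weights))
            for i in range(length)
        ]
    return aggregated


# Two equally sized clients; the second holds degraded data (quality 0.25).
clients = [{"layer": [1.0, 2.0]}, {"layer": [5.0, 6.0]}]
result = quality_weighted_average(
    clients, num_samples=[100, 100], quality=[1.0, 0.25]
)
```

With equal quality scores the coefficients reduce to plain sample-size weighting, as in FedAvg, so the quality term only matters when clients differ in data quality.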
Volume 28, Article 200601
Citations: 0
Beyond algorithms: Artificial intelligence driven talent identification with human insight
IF 4.3 Pub Date : 2025-11-07 DOI: 10.1016/j.iswa.2025.200604
Tiago Jacob Fernandes França , José Henrique Pereira São Mamede , João Manuel Pereira Barroso , Vítor Manuel Pereira Duarte dos Santos
The rapid evolution of Artificial Intelligence (AI) is reshaping Human Resource Management (HRM), with growing interest in its role in talent identification. While AI has demonstrated effectiveness in analysing structured data, its limitations in assessing qualitative attributes such as creativity, adaptability, and emotional intelligence remain underexplored. This study addresses these gaps through an exploratory mixed-methods design, combining a global survey (n = 240) with semi-structured interviews of HR professionals. Quantitative analysis highlights patterns of association between key competencies, while qualitative findings provide contextual insights into perceptions of fairness, bias, and cultural resistance. The results suggest that AI can complement, but not replace, human judgement, supporting a Hybrid Evaluative Model that integrates algorithmic efficiency with human interpretation. The study contributes rare empirical evidence to a nascent field, highlights the ethical imperatives of bias mitigation and transparency, and underscores the importance of cultural context (collectivist versus individualist orientations) in shaping the acceptance and effectiveness of AI-enabled HR practices. These findings offer practical guidance for organisations and advance theory-building at the intersection of AI and HRM.
Volume 28, Article 200604
Citations: 0
Alert correlation for intelligent threat detection and response
IF 4.3 Pub Date : 2025-11-07 DOI: 10.1016/j.iswa.2025.200606
Bronagh Lanigan , Zeinab Rezaeifar , Federico Cruciani , Michael Milliken , Jordan Vincent , Samuel Moore , Muhammad Aaqib , Alan Mills , Pushpinder K. Chouhan , Alfie Beard , Chris D. Nugent , Luke Chen , Alex Healing
With the increasing diversity of IoT devices, keeping IT systems secure is becoming increasingly difficult. Attackers exploit vulnerabilities within the system in order to access sensitive information, typically reaching their objective through several steps. Current Intrusion Detection Systems (IDSs) focus on low-level alerts and tend to produce a high rate of false positives. This type of information alone is insufficient for the detection of sophisticated attack scenarios such as Advanced Persistent Threats (APTs). Consequently, correlation techniques have recently been introduced to correlate alerts and reconstruct attack scenarios. However, various attack scenarios exist, with diverse characteristics, and the different steps of an APT scenario may have their own characteristics. Therefore, finding a method that covers all cases remains a challenge. Moreover, how the system should respond after detecting an APT, so as to avoid sabotage, also remains a challenge. In this paper, we first classify the different cases for attack detection and then propose a method based on the different characteristics of attack patterns to detect APT scenarios. The proposed method consists of two main phases: APT detection and the intelligent hybrid response framework. In the APT detection phase, similar alerts are aggregated and attack graphs are generated based on a similarity matrix. These graphs, combined with third-party API data, enable alert correlation and APT scenario detection. Entity graphs are then created to visualise host behaviour, and alert graphs are analysed to detect APT scenarios. In the response phase, attack graphs produced from the correlation inform the hybrid response framework, integrating knowledge- and data-driven components that facilitate automated or recommended mitigation. The approach was evaluated on the ZeekData24 dataset. Precision and recall on the malicious traffic were 96.65% and 87.04%, respectively. The results show that our approach can effectively filter false positive alerts, reducing the data from 10,063 daily alerts to 586 meta-alerts, which are pruned to 48 attack graphs and finally reduced to 20 suspicious attack graphs.
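The aggregation step, in which similar alerts are merged into meta-alerts via a similarity matrix, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's algorithm: the alert fields in `FIELDS`, the field-matching similarity, and the 0.6 threshold are all hypothetical.

```python
from typing import Dict, List

Alert = Dict[str, str]
FIELDS = ("src", "dst", "signature")  # hypothetical alert attributes


def similarity(a: Alert, b: Alert) -> float:
    """Fraction of compared fields on which two alerts agree."""
    return sum(a[f] == b[f] for f in FIELDS) / len(FIELDS)


def aggregate(alerts: List[Alert], threshold: float = 0.6) -> List[List[int]]:
    """Group alert indices into meta-alerts.

    Builds the full similarity matrix, links pairs at or above the
    threshold, and returns the connected components of that graph.
    """
    n = len(alerts)
    matrix = [[similarity(alerts[i], alerts[j]) for j in range(n)] for i in range(n)]
    seen = set()
    groups: List[List[int]] = []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:  # depth-first walk over sufficiently similar alerts
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and matrix[i][j] >= threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(component))
    return groups


alerts = [
    {"src": "10.0.0.5", "dst": "10.0.0.9", "signature": "port-scan"},
    {"src": "10.0.0.5", "dst": "10.0.0.9", "signature": "ssh-bruteforce"},
    {"src": "10.0.0.7", "dst": "10.0.0.2", "signature": "dns-tunnel"},
]
meta_alerts = aggregate(alerts)  # the first two alerts share src and dst
```

Each returned group is one meta-alert; a real system would presumably use richer features (timestamps, ports, alert types) and go on to build attack graphs from the groups.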
Volume 28, Article 200606
Citations: 0
Sentiment analysis: From rule-based lexicons to large language models
IF 4.3 Pub Date : 2025-11-07 DOI: 10.1016/j.iswa.2025.200599
Maikel Leon
This study provides a comprehensive review of two decades of research in opinion mining and sentiment analysis, addressing the fragmentation of prior work across methodologies, application domains, and data sources. The evolution of the field is traced from pre-1990 rule-based systems to lexicon heuristics, statistical learning, machine learning, deep learning, and the current wave of transformer-driven, multimodal, and generative models. Applications are examined across marketing, finance, politics, and social media, with emphasis on how methodological innovations have improved accuracy and enabled broader adoption. Best practices – including transformer fine-tuning, prompt engineering, zero-shot and few-shot learning, multimodal fusion, and domain adaptation – are analyzed to distill evidence-based guidelines for researchers and practitioners. The synthesis shows how sentiment analysis has shaped critical areas, including brand management, investor decision-making, political discourse, and online user engagement. Findings highlight the effectiveness of transformer-based approaches, particularly when combined with domain adaptation and prompt engineering, in delivering state-of-the-art performance. Beyond methodological and applied insights, the study identifies promising directions for future research, including real-time customer journey analytics, explainability in generative AI, robustness across multiple languages, ethical implications, and sustainability considerations. By consolidating dispersed knowledge into a unified account, this review provides both historical grounding and a structured roadmap that advances theoretical understanding and informs managerial practice.
Volume 28, Article 200599
Citations: 0
Comprehensive analysis on laser spots adversarial attacks using genetic algorithm
IF 4.3 Pub Date : 2025-11-01 DOI: 10.1016/j.iswa.2025.200598
Youssef Mansour , Ayad Turky , Ibrahim Abaker Hashem , Imad Afyouni , Ali Bou Nassif , Ismail Shahin , Ashraf Elnagar
Deep Neural Networks (DNNs) are highly vulnerable to disruptions caused by minimal noise, yet research on physical attacks leveraging light-based methods remains scarce. Light-based physical attacks are exceptionally stealthy, posing substantial security threats to vision-dependent applications such as autonomous driving. This paper enhances a state-of-the-art light-based physical attack that employs a genetic algorithm to optimize laser spot placement for maximum effectiveness. We expand the algorithm by introducing additional hyperparameters and systematically optimizing them to establish the most efficient workflow for this problem. To our knowledge, this is the first light-based attack capable of reliably performing physical attacks during daylight conditions, making it the most effective and robust approach of its kind. Extensive experiments conducted in a digital environment demonstrate the superiority of the genetic algorithm over random-location methods. By identifying optimal hyperparameter values, we achieve significant improvements in both performance and efficiency. Specifically, we managed to achieve an Attack Success Rate (ASR) of 89.7%, with an Average Query (AQ) of only 109.4, demonstrating a highly efficient and effective approach. The results reveal that laser spots can severely interfere with advanced DNNs, highlighting the critical security risks associated with this technique.
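To make the idea of optimising laser spot placement with a genetic algorithm concrete, here is a toy sketch. It is not the authors' implementation: the operators (truncation selection, uniform crossover, Gaussian mutation, elitism), the hyperparameter defaults, and the `toy_score` stand-in for the victim model are all assumptions. The query counter mirrors the role of the Average Query metric mentioned above.

```python
import random
from typing import Callable, Tuple

Spot = Tuple[float, float]


def genetic_search(
    score: Callable[[Spot], float],
    size: int = 32,
    population: int = 20,
    generations: int = 30,
    mutation_std: float = 2.0,
    seed: int = 0,
) -> Tuple[Spot, float, int]:
    """Search for the spot centre that maximises `score`.

    Returns the best spot, its score, and the number of score queries.
    """
    rng = random.Random(seed)

    def clamp(v: float) -> float:
        return min(max(v, 0.0), size - 1.0)

    pop = [(rng.uniform(0, size - 1), rng.uniform(0, size - 1)) for _ in range(population)]
    queries = 0
    best, best_score = pop[0], float("-inf")
    for _ in range(generations):
        scored = []
        for spot in pop:
            scored.append((score(spot), spot))
            queries += 1  # one model query per candidate
        scored.sort(key=lambda item: item[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best = scored[0]
        parents = [spot for _, spot in scored[: population // 2]]  # truncation selection
        children = []
        while len(children) < population - 1:
            (x1, y1), (x2, y2) = rng.sample(parents, 2)
            # Uniform crossover per coordinate, then Gaussian mutation.
            x = x1 if rng.random() < 0.5 else x2
            y = y1 if rng.random() < 0.5 else y2
            children.append(
                (clamp(x + rng.gauss(0, mutation_std)), clamp(y + rng.gauss(0, mutation_std)))
            )
        pop = [scored[0][1]] + children  # elitism: keep the generation's best
    return best, best_score, queries


def toy_score(spot: Spot) -> float:
    """Toy stand-in for the victim model: peaks when the spot is at (20, 12)."""
    x, y = spot
    return -((x - 20.0) ** 2 + (y - 12.0) ** 2)


best_spot, best_value, n_queries = genetic_search(toy_score)
```

In an actual attack, `score` would query the target network with the laser spot rendered onto the image; a random-location baseline would correspond to drawing candidates without selection or crossover.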
Volume 28, Article 200598
Citations: 0
CTR-Net: Scalable safe reinforcement learning via neural approximations of control theoretic regulators
IF 4.3 Pub Date : 2025-10-31 DOI: 10.1016/j.iswa.2025.200597
Ramen Ghosh
Ensuring hard constraint satisfaction during both training and deployment is central to safety-critical reinforcement learning (RL). Control-theoretic regularization (CTR) enforces safety by filtering actions through viability- or barrier-certified safe sets, but evaluating the state-dependent regulator $R(x)$ online is often prohibitive in high dimensions. We propose a scalable CTR framework based on neural regulator approximators $\hat{R}_\theta(x)$, differentiable surrogates of $R(x)$ that enable fast projection or rejection-sampling filters within standard RL loops. We formalize a learning-theoretic analysis for approximate safety filtering and prove probably approximately correct (PAC)-style guarantees: if the set approximation error is bounded by $\varepsilon$ with confidence $1-\delta$, then the probability of constraint violation along a length-$T$ rollout is bounded by a term that scales linearly in $T$ and $\varepsilon$ (plus $\delta$). We further show that the performance suboptimality of the filtered policy is controlled analytically by the same approximation envelope, yielding an explicit, provably quantified safety-versus-optimality tradeoff (PAC bounds linear in $T$ and the envelope), complemented by empirical ablations; see also the calculus-of-variations view of constrained tradeoffs (Younis, 2023). The resulting method, CTR-Net, is architecture-agnostic and supports real-time execution via fast, differentiable safety layers. Empirical evaluations on high-dimensional continuous-control benchmarks (including safe locomotion and constrained multi-joint manipulation) demonstrate reliable constraint satisfaction during learning and deployment, robustness under modeling uncertainty, and substantial computational gains relative to exact viability/barrier baselines. By coupling operator-free neural safety sets with CTR guarantees, CTR-Net bridges theoretical safety certificates and scalable implementation, advancing practical, real-time safe RL for complex intelligent systems.
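One way to see why such a bound is linear in $T$ is a union bound. The following is a sketch under assumed notation, not the paper's theorem: let $E$ be the event, of probability at least $1-\delta$, that the learned safe set is $\varepsilon$-accurate, let $V_t$ denote a constraint violation at step $t$, and suppose that on $E$ each filtered step violates the constraint with probability at most $\varepsilon$.

```latex
\begin{align*}
\Pr\Big[\textstyle\bigcup_{t=1}^{T} V_t\Big]
  &\le \Pr[E^{c}] + \Pr\Big[\textstyle\bigcup_{t=1}^{T} V_t \,\Big|\, E\Big] \\
  &\le \delta + \sum_{t=1}^{T} \Pr[V_t \mid E] \\
  &\le \delta + T\varepsilon .
\end{align*}
```

For instance, with $T = 100$, $\varepsilon = 10^{-4}$ and $\delta = 0.01$, this assumed form gives a violation probability of at most $0.02$.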
Volume 28, Article 200597
Citations: 0
Indirect visual odometry with a light-field camera
IF 4.3 Pub Date : 2025-10-31 DOI: 10.1016/j.iswa.2025.200600
Mohamad Al Assaad, Stéphane Bazeille, Christophe Cudel
Visual odometry is the technique of determining a robot's pose by analyzing images of its surroundings as it moves. It is categorized as monocular when a single camera is used, or stereo when two or more cameras are used. In this study, we investigate the use of a light-field camera for visual odometry. Capitalizing on the distinctive capability of a light-field camera to record both the intensity and the direction of light, we propose an indirect visual odometry method that estimates the scale of the translation, as stereo visual odometry does, but with a single camera sensor. Our framework combines light-field imaging with conventional odometry techniques to track camera movement, using the depth information provided by a light-field depth estimation approach. Additionally, this method differs from state-of-the-art methods by using a simplified calibration process and a new keypoint extraction method, which makes light-field cameras easier to use for robotic perception.
Citations: 0
End-to-end semantically aware tactile generation
IF 4.3 Pub Date : 2025-10-28 DOI: 10.1016/j.iswa.2025.200594
Mohammad Mahdi Heydari Dastjerdi, Abbas Akkasi, Hilaire Djani, Aatreyi Pranavbhai Mehta, Majid Komeili
Tactile graphics are an essential tool for conveying visual information to visually impaired individuals. However, translating 2D plots, such as Bézier curves, polygons, and bar charts, into an effective tactile format remains a challenge. This paper presents a novel, two-stage deep learning pipeline for automating this conversion process. Our method leverages a Pix2Pix architecture, employing a U-Net++ generator network for robust image generation. To improve the perceptual quality of the tactile representations, we incorporate an adversarial perceptual loss function alongside a gradient penalty. The pipeline operates sequentially: it first converts the source plot into a grayscale tactile representation, then transforms it into a channel-wise equivalent. We evaluate the performance of our model on a comprehensive synthetic dataset consisting of 20,000 source-target pairs encompassing various 2D plot types. To quantify performance, we utilize fuzzy versions of established metrics such as pixel accuracy, Dice coefficient, and Jaccard index. Additionally, a human study is conducted to assess the visual quality of the generated tactile graphics. The proposed approach demonstrates promising results, significantly streamlining the conversion of 2D plots into tactile graphics. This paves the way for the development of fully automated systems, enhancing accessibility of visual information for visually impaired individuals.
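The abstract mentions fuzzy versions of the Dice coefficient and Jaccard index but does not define them. The sketch below shows one common fuzzy-set formulation, assuming pixel values in [0, 1], with the minimum as the fuzzy intersection and the maximum as the fuzzy union. The function names and the flat-list image representation are illustrative assumptions, not the authors' implementation.

```python
def fuzzy_dice(pred, target):
    """Fuzzy Dice: 2 * sum(min(p, t)) / (sum(p) + sum(t)), values in [0, 1].

    Assumed formulation; the paper may define its fuzzy metric differently.
    """
    intersection = sum(min(p, t) for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def fuzzy_jaccard(pred, target):
    """Fuzzy Jaccard: sum(min(p, t)) / sum(max(p, t)), values in [0, 1]."""
    intersection = sum(min(p, t) for p, t in zip(pred, target))
    union = sum(max(p, t) for p, t in zip(pred, target))
    return 1.0 if union == 0 else intersection / union


# Hypothetical flattened 2x2 grayscale images with intensities in [0, 1].
pred = [1.0, 0.5, 0.0, 0.0]
target = [1.0, 1.0, 0.0, 0.5]
dice = fuzzy_dice(pred, target)
jaccard = fuzzy_jaccard(pred, target)
```

With binary masks these reduce to the usual Dice and Jaccard scores, and under this formulation the two remain linked by J = D / (2 - D).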
Citations: 0
Explainable Artificial Intelligence: A systematic Review of Progress and Challenges
IF 4.3 Pub Date : 2025-10-27 DOI: 10.1016/j.iswa.2025.200595
Azza Mohamed , Khaled Abdelqader , Khaled Shaalan
This work employs a multidisciplinary approach to identify research gaps in the existing literature by presenting a systematic review of systematic reviews on Explainable Artificial Intelligence (XAI). To the best of our knowledge, this is the first thorough meta-review that combines the findings of several excellent reviews to offer a more elevated viewpoint on the goals and difficulties facing the area. The review covers empirical studies published between 2021 and 2023, focusing on high-quality sources. An initial pool of 997 entries was screened across multiple databases, yielding 928 unique articles after duplicate removal. Ultimately, 14 studies met the inclusion criteria and were analyzed in depth. The quality assessment confirmed that all selected reviews adhered to established methodological standards. The key findings show XAI's broad uses, which range from increasing trust and transparency to assisting with financial and management decision-making. The prevalence of healthcare-focused studies emphasizes XAI's importance in enhancing interpretability, fairness, regulatory compliance, and personalized treatment options. Commonly used techniques include visual explanation tools, interpretable machine learning models, and model-agnostic approaches. While the review offers valuable insights, it acknowledges limitations such as its reliance on Q1 journals and the exclusion of broader sources, which may affect comprehensiveness. To advance the field, the study recommends expanding future research to underrepresented domains like autonomous vehicles, defense, and smart cities. It also calls for methodological innovation to enhance accessibility, fairness, privacy, and the development of intuitive explanation strategies. Addressing these gaps can significantly improve the trustworthiness and effectiveness of AI systems across sectors.
Citations: 0
Fusing explainable deep learning ensembles and LLM recommendations for real-time plant leaf disease diagnosis
IF 4.3 Pub Date : 2025-10-24 DOI: 10.1016/j.iswa.2025.200596
Dip Kumar Saha , Mohammad Rasel Ahmed , Tushar Deb Nath , Rounakul Islam Boby , Md. Jakir Hossen , M.F. Mridha
Timely and accurate identification of plant leaf diseases is vital for healthy plant cultivation, sustainable agriculture, and universal food security. In this study, we present a plant leaf disease recognition mechanism that utilizes a stacking ensemble structure combined with a Large Language Model (LLM) and Explainable AI (XAI) mechanism to improve identification accuracy and comprehensibility. To capture high textural structure, we utilized the Gray Level Co-occurrence Matrix (GLCM), whereas the MobileNetV3 architecture was utilized to maintain low computational cost in feature extraction. GoogleNet was integrated to improve multi-scale feature extraction by employing inception blocks, which effectively obtain fine-grained details and universal spatial patterns. Our ensemble framework integrates improved versions of MobileNetV3, GoogleNet, and ConvNeXtSmall, with CatBoost employed as a nonlinear meta-learner that allows the framework to capture complex connections among the base models. Moreover, we utilized additional CNN models, including AlexNet and EfficientNetV2B0, to compare the results of our proposed stacking ensemble model and to evaluate its generalization ability over various architectural designs. In addition, we developed a real-time system integrating an LLM with the proposed ensemble model, ensuring automatic plant leaf disease recognition and delivering corresponding curing recommendations. Our findings contribute to plant-based agriculture by enabling early diagnosis of leaf diseases and providing real-time recommendations through DL and LLM technology.
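For readers unfamiliar with the Gray Level Co-occurrence Matrix mentioned in the abstract, the following is a minimal sketch of the general technique, not the authors' pipeline. It counts how often gray level i appears next to gray level j at a fixed pixel offset and derives the standard contrast feature from those counts. The function names, the tiny 3x3 image, and the choice of a horizontal offset are illustrative assumptions.

```python
def glcm(image, levels, dy=0, dx=1):
    """Count co-occurrences of gray levels (i, j) at offset (dy, dx).

    `image` is a list of rows holding integer gray levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            # Skip neighbours that fall outside the image.
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
    return counts


def glcm_contrast(counts):
    """Contrast: sum over (i, j) of (i - j)^2 * p(i, j), with p normalized."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    weighted = sum(
        (i - j) ** 2 * c
        for i, row in enumerate(counts)
        for j, c in enumerate(row)
    )
    return weighted / total


# Hypothetical 3x3 image quantized to three gray levels.
image = [
    [0, 0, 1],
    [1, 2, 2],
    [0, 1, 2],
]
matrix = glcm(image, levels=3)
contrast = glcm_contrast(matrix)
```

Texture features of this kind are typically concatenated with learned CNN features before classification; how the paper combines them is not stated in the abstract.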
Citations: 0