
Artificial Intelligence Review: Latest Articles

Unifying ground and air: a comprehensive review of deep learning-enabled CAVs and UAVs
IF 13.9 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-20 | DOI: 10.1007/s10462-025-11425-1
Muhammad Umer Zia, Wei Xiang, Tao Huang, Jameel Ahmad, Jawwad Nasar Chattha, Ijaz Haider Naqvi, Faran Awais Butt

The tremendous advancements in artificial intelligence (AI) techniques, particularly those pertinent to computer vision and image recognition, are revolutionizing the automotive industry and driving the development of intelligent transportation systems for smart cities. Integrating AI techniques into connected autonomous vehicles (CAVs) and unmanned aerial vehicles (UAVs), together with fusing their data, enables a new paradigm that allows for unparalleled real-time awareness of the surrounding environment. The potential of emerging wireless technologies can be fully exploited by establishing communication and cooperation among AI-augmented CAVs and UAVs. However, configuring appropriate deep learning (DL) models for connected vehicles is a complex task. Any errors can result in severe consequences, including loss of vehicles, infrastructure, and human lives. These systems are also susceptible to cyber attacks, necessitating thorough and timely threat analysis and countermeasures to prevent catastrophic events. Our findings highlight the effectiveness of AI-driven data fusion in enhancing cooperative perception between CAVs and UAVs, identify security vulnerabilities in DL-based systems, and demonstrate how V2X-enabled UAVs can significantly improve situational awareness in corner cases.

Citations: 0
Advances and challenges in infrared-visible image fusion: a comprehensive review of techniques and applications
IF 13.9 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-20 | DOI: 10.1007/s10462-025-11426-0
Rongchao Wang, Zhaofa Zhou, Shuhui Li, Zhili Zhang

Infrared–visible image fusion (IVIF) integrates complementary thermal and photometric cues for surveillance, remote sensing, and autonomous perception. Existing surveys, while comprehensive, provide limited guidance for design-to-deployment and seldom relate fusion quality to task outcomes or device constraints. This work provides a unified perspective that organizes IVIF methods along an interface-attention-alignment coordinate system covering classical spatial/transform pipelines and contemporary deep paradigms (generative, discriminative, multi-task, hybrid/Transformer, dynamic). Building on literature through 2025, we synthesize fidelity-robustness-efficiency trade-offs and introduce a comparison-to-deployment protocol that couples fusion metrics with task accuracy (AP/mIoU), latency, memory footprint, and condition-performance characterization (misregistration, noise, illumination/weather). We consolidate Transformer/hybrid coverage with practical recipes and focused guidance on temporal consistency, robustness auditing, and physics-grounded interpretability. Compared with previous reviews, our survey concurrently addresses four under-covered dimensions (video temporal consistency, robustness auditing, task-aware evaluation, and deployment reporting) and distills a practical checklist linking architectural choices to operating conditions and hardware budgets, enabling reproducible, task-relevant IVIF practice.
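
As a rough illustration of what reporting task accuracy alongside latency might look like, the sketch below computes mean intersection-over-union (mIoU) on flattened label maps and times a downstream model. It is not the protocol from the paper: `task_report`, the `segment` callable, and the flat-list label format are assumptions made here for illustration only.

```python
import time

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both label maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

def task_report(segment, fused_pixels, target, num_classes):
    """Run a hypothetical downstream segmenter and record accuracy with latency."""
    start = time.perf_counter()
    pred = segment(fused_pixels)  # `segment` stands in for any task model
    latency_ms = (time.perf_counter() - start) * 1000.0
    return {"mIoU": mean_iou(pred, target, num_classes), "latency_ms": latency_ms}
```

A fuller report along the lines the abstract describes would add memory footprint and repeat the measurement under degraded conditions such as misregistration or noise.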

Citations: 0
Knowledge distillation and dataset distillation of large language models: emerging trends, challenges, and future directions
IF 13.9 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-20 | DOI: 10.1007/s10462-025-11423-3
Luyang Fang, Xiaowei Yu, Jiazhang Cai, Yongkai Chen, Shushan Wu, Zhengliang Liu, Zhenyuan Yang, Haoran Lu, Xilin Gong, Yufang Liu, Terry Ma, Wei Ruan, Ali Abbasi, Jing Zhang, Tao Wang, Ehsan Latif, Wei Liu, Wei Zhang, Soheil Kolouri, Xiaoming Zhai, Dajiang Zhu, Wenxuan Zhong, Tianming Liu, Ping Ma

The exponential growth of Large Language Models (LLMs) continues to highlight the need for efficient strategies to meet ever-expanding computational and data demands. This survey provides a comprehensive analysis of two complementary paradigms: Knowledge Distillation (KD) and Dataset Distillation (DD), both aimed at compressing LLMs while preserving their advanced reasoning capabilities and linguistic diversity. We first examine key methodologies in KD, such as task-specific alignment, rationale-based training, and multi-teacher frameworks, alongside DD techniques that synthesize compact, high-impact datasets through optimization-based gradient matching, latent space regularization, and generative synthesis. Building on these foundations, we explore how integrating KD and DD can produce more effective and scalable compression strategies. Together, these approaches address persistent challenges in model scalability, architectural heterogeneity, and the preservation of emergent LLM abilities. We further highlight applications across domains such as healthcare and education, where distillation enables efficient deployment without sacrificing performance. Despite substantial progress, open challenges remain in preserving emergent reasoning and linguistic diversity, enabling efficient adaptation to continually evolving teacher models and datasets, and establishing comprehensive evaluation protocols. By synthesizing methodological innovations, theoretical foundations, and practical insights, our survey charts a path toward sustainable, resource-efficient LLMs through the tighter integration of KD and DD principles.
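
For readers unfamiliar with knowledge distillation, a minimal sketch of the classic temperature-scaled soft-target loss is given below. It is a generic textbook formulation, not a method taken from the survey; the function names and the default values for `temperature` and `alpha` are assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, label, temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student against the ground-truth label
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps soft-target gradients on a comparable scale
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

With `alpha=1.0` the student is trained on the teacher's softened outputs alone; lower values mix in more of the ground-truth signal.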

Citations: 0
The Pygmalion effect in AI: influence of cultural narratives and policies on technological development
IF 13.9 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-18 | DOI: 10.1007/s10462-025-11407-3
T. J. Mateo Sanguino

Advances in generative artificial intelligence (AI), such as recent developments in text, audio, and video production, have amplified societal concerns, with threat probabilities estimated between 5 and 50%. This manuscript undertakes a comprehensive study to understand the factors influencing AI development, focusing on the interplay between AI research, cinematographic representations, and regulatory policies. The study reveals a strong interaction between scientific advances and cultural representations, indicating shared concerns and themes across both domains. It also highlights broad support for ethical and responsible AI development, with temporal analyses showing the significant influence of films on public perception and slower growth in policy implementation relative to cultural diffusion. The findings discuss the presence of a Pygmalion effect, where cultural representations shape perceptions of AI, and a potential Golem effect, where increased regulation may limit the dangerous development of AI and its societal impact. The study underscores the importance of balanced and ethical AI development, requiring continued monitoring and careful management of the relationship between research, cultural representations, and regulatory frameworks.

Citations: 0
Exploring the potential of explainable AI in brain tumor detection and classification: a systematic review
IF 13.9 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-18 | DOI: 10.1007/s10462-025-11410-8
Lincy Annet Abraham, Gopinath Palanisamy, Goutham Veerapu, J. S. Nisha

Brain tumors are among the most difficult medical conditions to analyze and treat. They must be detected accurately and promptly to improve patient outcomes and plan effective treatments. Recent advances in technologies such as artificial intelligence (AI) and machine learning (ML) have increased interest in applying AI to detect brain tumors. However, concerns have emerged regarding the reliability and transparency of AI models in medical settings, as their decision-making processes are often opaque and difficult to interpret. This research is unique in its focus on explainability in AI-based brain tumor detection, prioritizing confidence, safety, and clinical adoption over mere accuracy. It gives a thorough overview of explainable AI (XAI) methodologies, problems, and uses, linking scientific advances to the needs of real-world healthcare. XAI is a subfield of artificial intelligence that seeks to solve this problem by providing understandable and straightforward explanations for the choices made by AI models. Applications such as healthcare, where the interpretability of AI models is essential for guaranteeing patient safety and fostering confidence between medical professionals and AI systems, have seen the introduction of XAI-based procedures. This paper reviews recent advancements in XAI-based brain tumor detection, focusing on methods that provide justifications for AI model predictions. The study highlights the advantages of XAI in improving patient outcomes and supporting medical decision-making. The findings reveal that ResNet 18 performed better, with 94% training accuracy, 96.86% testing accuracy, low loss (0.012), and a rapid time (~6 s). ResNet 50 was a little slower (~13 s) but stable, with 92.86% test accuracy. DenseNet121 (AdamW) achieved the highest accuracy at 97.71%, but it was not consistent across all optimizers. ViT-GRU also reached 97% accuracy with very little loss (0.008), although it took a long time to compute (around 49 s). On the other hand, VGG models (around 94% test accuracy) and MobileNetV2 (loss up to 6.024) were less reliable, even though they trained faster. Additionally, the paper explores various opportunities, challenges, and clinical applications. Based on these findings, this research offers a comprehensive analysis of XAI-based brain tumor detection and encourages further investigation in specific areas.

Citations: 0
Validation is the central challenge for generative social simulation: a critical review of LLMs in agent-based modeling
IF 13.9 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-18 | DOI: 10.1007/s10462-025-11412-6
Maik Larooij, Petter Törnberg

Recent advances in Large Language Models (LLMs) have revitalized interest in Agent-Based Models (ABMs) by enabling “generative” simulations, with agents that can plan, reason, and interact through natural language. These developments promise greater realism and expressive power, but also revive long-standing concerns over empirical grounding, calibration, and validation—issues that have historically limited the uptake of ABMs in the social sciences. This paper systematically reviews the emerging literature on generative ABMs to assess how these long-standing challenges are being addressed. We map domains of application, categorize reported validation practices, and assess their alignment with the stated modeling goals. Our review suggests that the use of LLMs may exacerbate rather than alleviate the challenge of validating ABMs, given their black-box structure, cultural biases, and stochastic outputs. While the need for validation is increasingly acknowledged, studies often rely on face-validity or outcome measures that are only loosely tied to underlying mechanisms. Generative ABMs thus occupy an ambiguous methodological space—lacking both the parsimony of formal models and the empirical validity of data-driven approaches—and their contribution to cumulative social-scientific knowledge hinges on resolving this tension.

Citations: 0
Machine learning powered financial credit scoring: a systematic literature review
IF 13.9 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-18 | DOI: 10.1007/s10462-025-11416-2
Helmi Ayari, Pr. Ramzi Guetari, Pr. Naoufel Kraïem

Over the past few decades, credit scoring has become an important tool in the financial sector. It enables banks and financial institutions to assess the creditworthiness of individuals and reduce the risk of default. Owing to significant advances in artificial intelligence techniques, machine learning (ML) has made it possible to improve credit scoring by distinguishing between people with good creditworthiness and those with poorer creditworthiness. In this article, we present a systematic literature review of ML-based financial credit scoring methods published between 2018 and 2024. A total of 330 research papers were extracted from four different online databases and digital libraries. After the study selection procedure, 63 research papers were selected for this systematic review. This paper aims to identify the major ML methods used in credit scoring, assess their strengths and limitations, and highlight notable trends and advancements. In addition, the review addresses the critical challenges faced in the adoption of ML models for credit scoring. This study not only contributes to the understanding of effective ML techniques used for credit scoring but also guides future research by highlighting the promising avenues in ML-based credit scoring efforts.

Citations: 0
Vision-based fire management system using autonomous unmanned aerial vehicles: a comprehensive survey
IF 13.9 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-11-18 DOI: 10.1007/s10462-025-11415-3
Sufyan Danish, Md. Jalil Piran, Samee Ullah Khan, Muhammad Attique Khan, L. Minh Dang, Yahya Zweiri, Hyoung-Kyu Song, Hyeonjoon Moon

In recent years, the intensity and frequency of fires have increased significantly, resulting in considerable damage to property and the environment through wildfires, oil pipeline fires, hazardous gas emissions, and building fires. Effective fire management systems are essential for early detection, rapid response, and mitigation of fire impacts. To address this challenge, unmanned aerial vehicles (UAVs) integrated with state-of-the-art deep learning techniques offer a transformative solution for real-time fire detection, monitoring, and response. UAVs play an essential role in the detection, classification, and segmentation of fire-affected regions, enhancing vision-based fire management through advanced computer vision and deep learning technologies. This comprehensive survey critically examines recent advancements in vision-based fire management systems enabled by autonomous UAVs. It explores how baseline deep learning models, including convolutional neural networks, attention mechanisms, YOLO variants, generative adversarial networks, and transformers, enhance UAV capabilities for fire-related tasks. Unlike previous reviews that focus on conventional machine learning and general AI approaches, this survey emphasizes the unique advantages and applications of deep learning-driven UAV platforms in fire scenarios. It provides detailed insights into the architectures, performance, and applications of UAV-based fire management. Additionally, the paper describes the available fire datasets, along with their download links, and outlines critical challenges, including data imbalance, privacy concerns, and real-time processing limitations. Finally, the survey identifies promising future directions, including multimodal sensor fusion, lightweight neural network architectures optimized for UAV deployment, and vision-language models. By synthesizing current research and identifying future directions, this survey aims to support the development of robust, intelligent UAV-based solutions for next-generation fire management. Researchers and professionals can access the GitHub repository.

Citations: 0
Agentic AI: a comprehensive survey of architectures, applications, and future directions
IF 13.9 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-11-14 DOI: 10.1007/s10462-025-11422-4
Mohamad Abou Ali, Fadi Dornaika, Jinan Charafeddine

Agentic AI represents a transformative shift in artificial intelligence, but its rapid advancement has led to a fragmented understanding, often conflating modern neural systems with outdated symbolic models—a practice known as conceptual retrofitting. This survey cuts through this confusion by introducing a novel dual-paradigm framework that categorizes agentic systems into two distinct lineages: the symbolic/classical (relying on algorithmic planning and persistent state) and the neural/generative (leveraging stochastic generation and prompt-driven orchestration). Through a systematic PRISMA-based review of 90 studies (2018–2025), we provide a comprehensive analysis structured around this framework across three dimensions: (1) the theoretical foundations and architectural principles defining each paradigm; (2) domain-specific implementations in healthcare, finance, and robotics, demonstrating how application constraints dictate paradigm selection; and (3) paradigm-specific ethical and governance challenges, revealing divergent risks and mitigation strategies. Our analysis reveals that the choice of paradigm is strategic: symbolic systems dominate safety-critical domains (e.g., healthcare), while neural systems prevail in adaptive, data-rich environments (e.g., finance). Furthermore, we identify critical research gaps, including a significant deficit in governance models for symbolic systems and a pressing need for hybrid neuro-symbolic architectures. The findings culminate in a strategic roadmap arguing that the future of Agentic AI lies not in the dominance of one paradigm, but in their intentional integration to create systems that are both adaptable and reliable. This work provides the essential conceptual toolkit to guide future research, development, and policy toward robust and trustworthy hybrid intelligent systems.

Citations: 0
Decoding nature’s melody: significance and challenges of machine learning in assessing bird diversity via soundscape analysis
IF 13.9 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-11-14 DOI: 10.1007/s10462-025-11414-4
Jiangjian Xie, Shanshan Xie, Yang Liu, Xin Jing, Mengkun Zhu, Linlin Xie, Junguo Zhang, Kun Qian, Björn W. Schuller

The broad application of passive acoustic monitoring provides a critical data foundation for studying soundscape ecology, necessitating automated analysis methods to accurately extract ecological information from vast soundscape data. This review comprehensively and cohesively examines two predominant approaches in soundscape analysis: soundscape component recognition and acoustic indices methods. Focusing on machine learning (ML)-based analysis methods for bird diversity assessment over the past five years, this review surveys representative research within each category, outlining their respective strengths and limitations. This not only addresses the growing interest in this field but also identifies research gaps and poses key questions for future studies. The insights from this review are anticipated to significantly enhance the understanding of ML applications in soundscape analysis, guiding subsequent investigative efforts in this rapidly evolving discipline, and thereby better supporting long-term biodiversity monitoring and conservation initiatives.

Citations: 0