
Latest publications in Artificial Intelligence Review

Simulation of teaching behaviours in intelligent tutoring systems: a review using large language models
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-13 | DOI: 10.1007/s10462-025-11464-8
Luis Marquez-Carpintero, Alberto Lopez-Sellers, Miguel Cazorla

Intelligent Tutoring Systems (ITS) and allied digital platforms now constitute core infrastructure in many classrooms, where they automate formative assessment, personalise pacing and supply fine-grained analytics that would otherwise exceed human capacity. Against this backdrop, Large Language Models (LLMs) have emerged as a disruptive layer of functionality, expanding educational AI from rule-based tutoring to open-ended dialogue, generative content and real-time adaptation. Early classroom prototypes already leverage multi-agent LLM frameworks to orchestrate teacher-student and peer interactions, demonstrating richer discourse patterns and enhanced engagement when benchmarked with established observation rubrics. Most consequential, however, is the accelerating shift towards full simulation of teacher work. Emerging evidence suggests that prompting an LLM to rehearse lessons, generate reflective commentary, and iteratively revise materials can raise the quality of teaching plans to a level comparable to those crafted by expert educators. While the narrative highlights practical applications and pedagogical implications, this review is grounded in a systematic methodology combined with narrative analysis, ensuring analytical depth and thematic cohesion.
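
To make the rehearse-reflect-revise workflow described above concrete, here is a minimal sketch of such a loop. It assumes a hypothetical complete() helper standing in for any LLM chat-completion API; the prompts and the loop structure are illustrative, not the protocol of any study covered by the review.

    # Minimal sketch of a rehearse-reflect-revise loop over a lesson plan.
    # `complete` is a hypothetical stand-in for any LLM chat-completion call.
    def complete(prompt: str) -> str:
        raise NotImplementedError("wire this to an LLM API of your choice")

    def refine_lesson_plan(draft: str, rounds: int = 3) -> str:
        plan = draft
        for _ in range(rounds):
            rehearsal = complete(f"Act as the teacher and rehearse this lesson:\n{plan}")
            critique = complete(f"Write reflective commentary on this rehearsal, "
                                f"listing weaknesses:\n{rehearsal}")
            plan = complete(f"Revise the lesson plan to address the critique.\n"
                            f"Plan:\n{plan}\nCritique:\n{critique}")
        return plan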

Citations: 0
Enhanced pixel-wise style fusion network for stent malapposition recognition with re-parameterizing technique in OCT
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-12 | DOI: 10.1007/s10462-025-11465-7
Hua Zhang, Chen Zhang, Jing Li, Xuexi Xuan, Mingjie Wang, Bo Yi, Kai Xia, Haiyan Wang, Lei Yin, Xiaoqing Zhang

Percutaneous coronary intervention with stent implantation has become a widely used strategy to treat coronary artery disease. Stent malapposition (SM) may increase the risk of late stent thrombosis due to reduced stent tissue coverage, and has therefore attracted much clinical attention. Recently, optical coherence tomography (OCT) images have been utilized to visually assess stent apposition/malapposition. However, automated OCT-based SM recognition has been under-explored. Therefore, this paper proposes a novel enhanced pixel-wise style fusion network (EPSF-Net) to recognize SM automatically from OCT images. In EPSF-Net, considering that SM cues are subtle, we design a novel enhanced pixel-wise style fusion (EPSF) block, which first applies pixel-wise style pooling to aggregate pixel-wise style context, then enhances this context with multi-scale learning, and finally fuses the enhanced context via a pixel-wise fusion operator. Moreover, a re-parameterizing technique is utilized to reduce the parameters and computational cost of EPSF at the inference stage. Additionally, since no OCT dataset for SM recognition is publicly available, we construct one, named SM-OCT, which will be made available, to validate the effectiveness of our method. Extensive experiments on the SM-OCT dataset show that our proposed EPSF-Net achieves better SM recognition performance than state-of-the-art methods. Additionally, two publicly available OCT datasets are employed to verify the generalization of our method.
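
The block description above maps naturally onto code. Below is a minimal PyTorch sketch of one possible reading of the EPSF block (pool pixel-wise style context, enhance it at multiple scales, fuse it back per pixel); the pooling choice, the dilation rates, and the sigmoid-gated fusion operator are my assumptions, not the paper's actual design, and the inference-time re-parameterization is omitted.

    import torch
    import torch.nn as nn

    class EPSFBlock(nn.Module):
        """One possible reading of the abstract's EPSF block (assumed, not the
        authors' design): pool per-pixel style context, enhance it with
        parallel multi-scale convolutions, then fuse it back per pixel."""
        def __init__(self, channels: int):
            super().__init__()
            # pixel-wise style pooling: 1x1 conv mixing channel statistics at each pixel
            self.style_pool = nn.Conv2d(channels, channels, kernel_size=1)
            # multi-scale enhancement via parallel dilated 3x3 convolutions
            self.scales = nn.ModuleList(
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d) for d in (1, 2, 3)
            )
            self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            style = self.style_pool(x)
            multi = torch.cat([conv(style) for conv in self.scales], dim=1)
            gate = torch.sigmoid(self.fuse(multi))  # assumed pixel-wise fusion operator
            return x * gate                         # modulate input features per pixel

    out = EPSFBlock(64)(torch.randn(2, 64, 96, 96))  # -> shape (2, 64, 96, 96)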

Citations: 0
Concept drift detection in image data stream: a survey on current literature, limitations and future directions
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-11 | DOI: 10.1007/s10462-025-11428-y
Quang-Tien Tran, Nhien-An Le-Khac, Michela Bertolotto

Concept drift—changes in the underlying data distribution over time—poses a significant challenge to machine learning systems deployed in dynamic environments. While numerous drift detection methods have been developed for structured data such as tabular and time-series streams, concept drift in image data remains an underexplored area due to the unstructured and high-dimensional nature of visual information. This survey presents the first comprehensive review of concept drift detection methods tailored for image data streams. We propose a novel taxonomy that categorizes existing approaches based on key properties such as image feature handling, detection strategy, detection level, concept drift cause, and evaluation considerations. Through the lens of this taxonomy, we analyze 14 representative concept drift detection methods designed for image data, highlighting current approaches to the field, their strengths and limitations. Based on this analysis, we outline promising future research directions to advance the field of concept drift detection in image-based systems.
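
As a point of reference for what drift detection means in practice, the sketch below flags distributional change between a reference window of image embeddings and a recent stream window using a per-dimension two-sample KS test. It is a generic illustration, not one of the 14 surveyed methods.

    import numpy as np
    from scipy.stats import ks_2samp

    def drift_score(ref: np.ndarray, cur: np.ndarray) -> float:
        """Smallest KS-test p-value across embedding dimensions; a very low
        value suggests the current window no longer matches the reference."""
        return float(min(ks_2samp(ref[:, d], cur[:, d]).pvalue
                         for d in range(ref.shape[1])))

    rng = np.random.default_rng(0)
    ref = rng.normal(0.0, 1.0, size=(500, 8))  # embeddings before drift
    cur = rng.normal(0.6, 1.0, size=(500, 8))  # mean-shifted stream window
    # naive threshold; a multiple-testing correction would be more principled
    if drift_score(ref, cur) < 0.01:
        print("concept drift suspected")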

Citations: 0
Text data augmentation for large language models: a comprehensive survey of methods, challenges, and opportunities
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-11 | DOI: 10.1007/s10462-025-11405-5
Yaping Chai, Haoran Xie, Joe S. Qin

Increasingly large and complex pre-trained language models have demonstrated superior performance in many applications, but they usually require large training datasets to be adequately trained. Insufficient training data can lead a model to overfit and fail to cope with complex tasks. Large language models (LLMs) trained on extensive corpora have prominent text generation capabilities, which improve the quality and quantity of data and play a crucial role in data augmentation. Specifically, distinctive prompt templates are given in personalised tasks to guide LLMs in generating the required content. Recently, promising retrieval-based techniques have further enhanced the expressive performance of LLMs in data augmentation by introducing external knowledge, enabling them to produce more grounded data. This survey provides an in-depth analysis of data augmentation in LLMs, classifying the techniques into Simple Augmentation, Prompt-based Augmentation, Retrieval-based Augmentation, and Hybrid Augmentation. Additionally, we conduct extensive experiments across the four techniques, systematically compare and analyse their performance, and provide key insights. Following this, we connect data augmentation with three critical optimisation techniques. Finally, we introduce existing challenges and future opportunities that could further improve data augmentation. This survey provides researchers and practitioners of the text modality with avenues to address data scarcity and improve data quality, helping scholars understand the evolution of text data augmentation from traditional methods to human-like generation and agent search in the era of LLMs.
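
As an illustration of the prompt-based category, a template can ask an LLM for label-preserving paraphrases of a training example. This is a sketch under my own assumptions, with a hypothetical complete() helper standing in for any chat-completion API.

    # Sketch of prompt-based text augmentation: ask an LLM for paraphrases
    # that preserve the training label. `complete` is a hypothetical stand-in.
    def complete(prompt: str) -> str:
        raise NotImplementedError("wire this to an LLM API of your choice")

    TEMPLATE = ("Rewrite the following {label} movie review in a different "
                "style while keeping its sentiment unchanged:\n{text}")

    def augment(text: str, label: str, n: int = 3) -> list[str]:
        return [complete(TEMPLATE.format(label=label, text=text)) for _ in range(n)]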

Citations: 0
Reinforcement learning and the Metaverse: a symbiotic collaboration
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-11 | DOI: 10.1007/s10462-025-11433-1
Nada Elsokkary, Wasif Khan, Mohammed Shurrab, Rabeb Mizouni, Shakti Singh, Jamal Bentahar, Azzam Mourad, Hadi Otrok

The Metaverse is an emerging virtual reality space that merges digital and physical worlds and provides users with immersive, interactive, and persistent virtual environments. The Metaverse leverages multiple technologies, including digital twins, blockchain, artificial intelligence, extended reality, and edge computing, to realize seamless connectivity and interaction between the physical and virtual worlds. Artificial Intelligence (AI) empowers intelligent decisions in such complex, dynamic environments. More specifically, Reinforcement Learning (RL) is uniquely effective in the context of Metaverse applications: it learns through interaction and models sequential decision making, which makes it flexible and dynamic and allows it to discover complex strategies and emergent behavior in complicated environments where programming explicit rules is impractical. Although multiple works have explored the Metaverse and AI-based applications, there remains a significant gap in the literature addressing the contribution of RL algorithms within the Metaverse. Therefore, this review presents a comprehensive overview of RL algorithms for Metaverse applications. We examine the architecture of Metaverse networks, the role of RL in enhancing virtual interactions, and the potential for transferring learned behaviors to real-world applications. Furthermore, we categorize the key challenges, opportunities, and research directions associated with deploying RL in the Metaverse.
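
To ground the "sequential decision making" claim, the tabular Q-learning update below is the textbook instance of RL learning from interaction; it is a generic illustration, not a method proposed by the review.

    import numpy as np

    # Textbook tabular Q-learning: learn action values from observed transitions.
    n_states, n_actions = 10, 4
    Q = np.zeros((n_states, n_actions))
    alpha, gamma = 0.1, 0.99  # learning rate, discount factor

    def q_update(s: int, a: int, r: float, s_next: int) -> None:
        td_target = r + gamma * Q[s_next].max()   # bootstrap from best next action
        Q[s, a] += alpha * (td_target - Q[s, a])  # move estimate toward the target

    q_update(s=0, a=2, r=1.0, s_next=3)  # one transition observed from the environment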

Citations: 0
Flight into the future: a holistic review of AI-trends, vision, and challenges in drones technology
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-11 | DOI: 10.1007/s10462-025-11449-7
Shakeel Ahmad, Rahiel Ahmad, Ahmad Sami Al-Shamayleh, Divya Nimma, Muhammad Zaman, Nikola Ivković, Korhan Cengiz, Adnan Akhunzada, Ehtisham Haider

The use of Artificial Intelligence (AI) and Unmanned Aerial Vehicles (UAVs), also known as drones, is changing the way future communication and networking systems are designed. UAVs can collect data, support wireless networks, and help deliver services from the sky, which makes them an important part of modern technology. To understand these developments, we reviewed almost 250 research papers published between 2015 and 2024. Our review focuses on UAV network design, communication methods, energy management, AI-based optimization, and future challenges. Unlike previous surveys that mainly summarize individual technical domains, this work introduces a new AI-driven UAV classification framework that connects these aspects under one structure. The framework organizes UAV systems across five dimensions (mission adaptability, autonomy level, communication intelligence, scalability, and deployment context), providing a unified way to compare current and future UAV technologies. This analytical structure highlights how artificial intelligence enables UAVs to move from static, pre-defined operations toward dynamic, real-time decision-making and mission-specific adaptation. We found that deep learning and reinforcement learning are the most common AI methods used to improve routing, flight planning, resource use, and network performance. These techniques help UAV networks adapt to changing conditions and reduce communication delays. However, we also found several open challenges, such as improving real-time energy efficiency, increasing security and privacy, managing large drone groups (swarms), and dealing with regulatory and policy issues. By combining this new framework with an extensive literature review, the paper offers a holistic view that not only summarizes past progress but also maps existing gaps and trends for future research. This paper provides a clear summary of current research, explains key trends, and points out gaps such as the need for lightweight AI models and better swarm coordination. The insights from this review can help researchers and engineers build smarter, safer, and more efficient UAV networks in the future.

Citations: 0
Review of generative AI for synthetic data generation: a healthcare perspective
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-10 | DOI: 10.1007/s10462-025-11440-2
Hafiz Muhammad Waseem, Saif Ul Islam, Nikolaos Matragkas, Gregory Epiphaniou, Theodoros N. Arvanitis, Carsten Maple

Generative AI has emerged as a transformative technology in healthcare, enabling the generation of high-fidelity synthetic data for applications such as medical imaging, electronic health records, biomedical signal processing, and drug discovery. The increasing reliance on machine learning in healthcare necessitates large-scale, high-quality datasets, yet real-world data acquisition is often constrained by privacy regulations, heterogeneity, and limited accessibility. Generative AI models provide a viable solution by generating realistic and diverse synthetic datasets while preserving patient confidentiality. Unlike prior reviews that primarily focus on specific model classes or applications, this study fills a significant research gap by offering a unified, comparative evaluation of diverse generative models, including Generative Adversarial Networks, Variational Autoencoders, Transformers, and Diffusion Models, as well as their adaptations for privacy-preserving Federated Learning environments. Each model class is examined in terms of its variants, underlying methodologies, performance in healthcare applications, strengths, limitations, and computational feasibility. The study also investigates practical considerations for deploying generative AI in clinical settings, including challenges related to training stability, bias mitigation, model interpretability, and regulatory compliance. The insights from this review provide guidance for researchers and healthcare practitioners in selecting and optimizing generative AI models for medical applications, laying the foundation for future advancements in AI-driven healthcare solutions.
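
For orientation, the adversarial objective that defines the GAN family mentioned above is the standard minimax formulation from the GAN literature (included here for context, not taken from the survey; the other model classes optimize different objectives):

    \min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
      + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]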

Citations: 0
Thermohydraulic performance of spray cooling systems: a general model by machine learning
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-09 | DOI: 10.1007/s10462-025-11446-w
Mohammad Shamsodini Lori, Wenge Huang, Zhenhua Tian, Jiangtao Cheng

With the rapid development of high-power density instruments, spray cooling has drawn increasing interest in industry as a high-efficiency thermal management technology. Despite extensive research, typical spray cooling systems only function effectively within constrained conditions. Therefore, creating a general model for spray cooling is essential for accurately predicting its functionalities. In this work, we employed six machine learning (ML) algorithms to analyze the thermal performance and hydraulic properties of spray cooling. Leveraging data from 25 previous studies encompassing different working fluids, spray atomization types, Reynolds numbers (Re), Nusselt numbers (Nu), and Weber numbers (We), our ML models significantly enhance the prediction of spray cooling functionalities compared to traditional correlations. The effectiveness of these ML algorithms was experimentally validated, yielding mean absolute percentage errors (MAPEs) of 6%-20% for Nu and 4%-16% for mean droplet diameter d_d, respectively. Then, we proposed a general correlation for the thermal performances of various working fluids, atomization methods, and operational conditions, achieving a 38% reduction in MAPE compared to the most accurate existing correlation. Subsequently, this general correlation was integrated into the ML models, resulting in MAPEs ranging from 0.48% to 2.3%. Furthermore, we optimized the key factors of spray cooling, with the Nu number reaching 220. Finally, we employed the SHapley Additive exPlanations (SHAP) approach to interpret the ML models and to identify an optimal strategy towards greatly enhanced thermal performance. This study demonstrates that ML significantly outperforms the empirical correlations for evaluating spray cooling performance and functionalities, paving a new avenue for thermoregulation of modern power systems.
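
The MAPE metric quoted throughout is simple to reproduce; a minimal check (with made-up numbers, not the paper's data):

    import numpy as np

    def mape(y_true, y_pred) -> float:
        """Mean absolute percentage error, as quoted in the abstract."""
        y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
        return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

    # e.g. predicted vs. measured Nusselt numbers (illustrative values only)
    print(mape([180.0, 200.0, 220.0], [171.0, 212.0, 231.0]))  # ~5.3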

Citations: 0
A comprehensive survey on pose estimation and tracking in sports: methodologies, datasets, challenges, and future directions
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-09 | DOI: 10.1007/s10462-025-11344-1
Mustafa Hikmet Bilgehan Uçar, Serdar Solak, Ali Olow Jimale, Hakan Ünal, Süleyman Eken

Pose estimation and tracking in sports has gained significant attention due to its potential to revolutionize performance analysis, injury prevention, and strategic decision-making. This survey presents a comprehensive overview of the methodologies, datasets, challenges, and future directions in this rapidly evolving field. We explore traditional approaches, including geometric and statistical models, and highlight the transformative impact of deep learning techniques, such as convolutional neural networks, transformers, and hybrid architectures, which have enabled highly accurate and robust pose estimation. The paper also discusses dataset creation and ground-truthing techniques tailored to sports contexts, emphasizing the importance of multimodal data, scalability, and representativeness. Applications across diverse sports, from individual to team-based activities, demonstrate the versatility of pose estimation systems in both real-time and offline settings. However, challenges such as occlusions, dynamic backgrounds, and computational efficiency persist, necessitating further innovation. We identify future research directions, including the integration of multimodal data, edge computing, and ethical considerations, to enhance accuracy, interpretability, and generalizability. This survey aims to provide a foundational reference for researchers and practitioners, fostering advancements in pose estimation and tracking technologies that meet the unique demands of sports analytics.

Citations: 0
Graph network learning for human skeleton modeling: a survey
IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-12-09 | DOI: 10.1007/s10462-025-11442-0
Xi Yang, Shaoyi Li, Saisai Niu, Xiaokui Yue

Over the past few decades, Human Skeleton Modeling (HSM) has gained considerable attention in computer vision, with practical applications such as video surveillance, human-computer interaction, medical assistance analysis, and autonomous driving through images and videos. The performance of HSM and its applications on challenging datasets has improved significantly thanks to recent advances in deep learning, and these advances have been extended to non-Euclidean graph data with multiple nodes and edges. Because human joints and their skeletal connections are naturally represented as graph structures, graph networks are well suited to non-Euclidean HSM. In recent years, graph networks have become essential tools for HSM and behavioral analyses. However, prior surveys are often siloed, focusing either on a narrow class of models such as GCNs or on a single application like action recognition; a unified framework that systematically analyzes diverse graph network learning paradigms across the entire HSM pipeline has been notably absent. We conduct a survey of graph network methods for HSM and their application domains. This comprehensive overview includes a taxonomy of graph network techniques, a detailed study of benchmark datasets for HSM, extensive descriptions of the performance of graph networks in three major application domains, and a collection of related resources and open-source codes. Finally, we provide recommendations for future research directions and trends of graph networks for HSM. This survey serves as introductory material for beginners in graph network-based HSM and as a reference for advanced researchers.
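
The joints-as-nodes, bones-as-edges representation the survey builds on can be shown in a few lines: a toy five-joint skeleton and one symmetric-normalized graph-convolution layer. This uses the standard GCN propagation rule, not any specific surveyed architecture.

    import numpy as np

    # Toy 5-joint skeleton: joints are nodes, bones are edges.
    edges = [(0, 1), (1, 2), (1, 3), (1, 4)]  # head-torso, torso-arms, torso-pelvis
    n = 5
    A = np.eye(n)                             # adjacency with self-loops
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
    A_hat = d_inv_sqrt @ A @ d_inv_sqrt       # symmetric normalization

    rng = np.random.default_rng(0)
    X = rng.normal(size=(n, 3))               # per-joint features, e.g. 3D coordinates
    W = rng.normal(size=(3, 16))              # learnable weights in a real model
    H = np.maximum(A_hat @ X @ W, 0.0)        # one GCN layer: ReLU(A_hat @ X @ W)
    print(H.shape)                            # (5, 16)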

Citations: 0