
Frontiers in Artificial Intelligence: Latest Articles

Correction: Machine learning-based detection of cognitive decline using SSWTRT: classification performance and decision analysis.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-08 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1764066
Yuji Nozaki, Chihiro Kamohara, Ryota Abe, Taiki Ieda, Madoka Nakajima, Maki Sakamoto

[This corrects the article DOI: 10.3389/frai.2025.1689182.].

Citations: 0
Multi-modal AI in precision medicine: integrating genomics, imaging, and EHR data for clinical insights.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-07 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1743921
Shahper Nazeer Khan, Danishuddin, Mohd Wajid Ali Khan, Luca Guarnera, Syed Mohammad Fauzan Akhtar

Precision healthcare is increasingly oriented toward the development of therapeutic strategies that are as individualized as the patients receiving them. Central to this paradigm shift is artificial intelligence (AI)-enabled multi-modal data integration, which consolidates heterogeneous data streams, including genomic, transcriptomic, proteomic, imaging, environmental, and electronic health record (EHR) data, into a unified analytical framework. This integrative approach enhances early disease detection, facilitates the discovery of clinically actionable biomarkers, and accelerates rational drug development, with particularly significant implications for oncology, neurology, and cardiovascular medicine. Advanced machine learning (ML) and deep learning (DL) algorithms are capable of extracting complex, non-linear associations across data modalities, thereby improving diagnostic precision, enabling robust risk stratification, and informing patient-specific therapeutic interventions. Furthermore, AI-driven applications in digital health, such as wearable biosensors and real-time physiological monitoring, allow for continuous, dynamic refinement of treatment plans. This review examines the transformative potential of multi-modal AI in precision medicine, with emphasis on its role in multi-omics data integration, predictive modeling, and clinical decision support. In parallel, it critically evaluates prevailing challenges, including data interoperability, algorithmic bias, and ethical considerations surrounding patient privacy. The synergistic convergence of AI and multi-modal data represents not merely a technological innovation but a fundamental redefinition of individualized healthcare delivery.

Citations: 0
Use of machine learning models to predict mechanical ventilation, ECMO, and mortality in COVID-19.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-06 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1661637
Nina Moorman, Erin Hedlund-Botti, Grace Gombolay, Matthew C Gombolay

Introduction: Patients with severe COVID-19 may require mechanical ventilation (MV) or extracorporeal membrane oxygenation (ECMO). Predicting who will require these interventions, and for how long, is challenging because patient responses are diverse and the disease is dynamic. There is therefore a need for better prediction of the duration and outcomes of MV use, both to improve patient care and to aid MV and ECMO allocation. Here we develop machine learning (ML) models and examine their performance in predicting MV duration, ECMO use, and mortality for patients with COVID-19.

Methods: In this retrospective prognostic study, hierarchical machine-learning models were developed to predict MV duration and patient outcomes from demographic data and time-series data consisting of vital signs and laboratory results. The models were trained on 10,378 patients from Emory's COVID CRADLE Dataset who tested positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and sought treatment at Emory University Hospital between February 28, 2020, and January 24, 2022. Analysis was conducted between January 10, 2022, and April 5, 2024. The main outcome measures were the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the F-score for MV duration, need for ECMO, and mortality prediction.

Results: Data from 10,378 patients with COVID-19 (median [IQR] age, 60 [48-72] years; 5,281 [50.89%] women) were included. The overall class distribution for 0, 1-4, 5-9, 10-14, 15-19, 20-24, 25-29, and ≥30 days of MV was 8,141 (78.44%), 812 (7.82%), 325 (3.13%), 241 (2.32%), 153 (1.47%), 97 (0.93%), 87 (0.84%), and 522 (5.03%) patients, respectively. Overall ECMO use and mortality were 15 (0.14%) and 1,114 (10.73%) patients, respectively. For MV duration, ECMO use, and mortality, the highest-performing models reached weighted average AUROC scores of 0.873, 0.902, and 0.774 and weighted average AUPRC scores of 0.790, 0.999, and 0.893, respectively.
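
The abstract reports weighted average AUROC scores but does not say how the averaging was done. The sketch below shows one common convention, assumed here rather than taken from the paper: a one-vs-rest AUROC per class, averaged with weights proportional to each class's share of the samples. The function names `binary_auroc` and `weighted_ovr_auroc` and the toy data are illustrative only.

```python
from collections import Counter


def binary_auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random
    negative (Mann-Whitney formulation); tied scores count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_ovr_auroc(y_true, y_score):
    """Support-weighted one-vs-rest AUROC (assumed averaging scheme).

    y_true:  list of integer class indices, one per sample.
    y_score: list of per-class score lists, one row per sample.
    """
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for cls, count in support.items():
        labels = [1 if y == cls else 0 for y in y_true]
        scores = [row[cls] for row in y_score]
        result += (count / total) * binary_auroc(labels, scores)
    return result


# Toy binary check: three of the four positive/negative pairs are ordered correctly.
print(binary_auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under this scheme, a class distribution as skewed as the one reported (78.44% of patients in the 0-day class) means the majority class dominates the weighted average, so per-class scores are worth inspecting alongside it.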

Conclusions and relevance: Hierarchical ML models trained on vital signs, laboratory results, and demographic data show promise for the prediction of MV duration, ECMO use, and mortality in COVID-19 patients.

Citations: 0
Understanding user perceptions of DeepSeek: insights from sentiment, topic and network analysis using a Reddit-based study.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-06 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1703949
Naisarg Patel, Rajesh Sharma, Prakash Lingasamy, Vino Sundararajan, Sajitha Lulu Sudhakaran, Vijayachitra Modhukur

Introduction: The launch of DeepSeek, a Chinese open-source generative AI model, generated substantial discussion regarding its capabilities and implications. The r/deepseek subreddit emerged as a key forum for real-time public evaluation. Analyzing this discourse is essential for understanding the sociotechnical perceptions shaping the integration of emerging AI systems.

Methods: We analyzed 46,649 posts and comments from r/deepseek (January-May 2025) using a computational framework combining VADER sentiment analysis, Hartmann emotion classification, BERTopic for thematic modeling, hyperlink extraction, and directed network analysis. Data preprocessing included cleaning, normalization, and lemmatization. We also examined correlations between sentiment/emotion scores and dominant topics.
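
Two of the pipeline stages named above, hyperlink extraction and directed network analysis, can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the authors' code: the URL pattern, the helper names (`extract_domains`, `domain_counts`, `in_degree`), and the sample text are all hypothetical, and the edge list assumes each reply is represented as an (author, parent author) pair.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Simple URL matcher (an assumption): stops at whitespace and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")


def extract_domains(text):
    """Return the host of every URL in text, lowercased and without 'www.'."""
    domains = []
    for url in URL_PATTERN.findall(text):
        host = (urlparse(url).hostname or "").rstrip(".")
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.append(host)
    return domains


def domain_counts(texts):
    """Count how often each domain is linked across a collection of posts."""
    counts = Counter()
    for text in texts:
        counts.update(extract_domains(text))
    return counts


def in_degree(edges):
    """In-degree of each node in a directed graph given as (source, target) pairs."""
    degree = Counter()
    for _source, target in edges:
        degree[target] += 1
    return degree


# Made-up sample post; the repository paths are placeholders.
sample = "Weights at https://huggingface.co/some-org/some-model, code at (https://www.github.com/some-org/some-repo)."
print(extract_domains(sample))
```

Counting domains this way gives the kind of per-site engagement summary reported in the results; in-degree is only one of several measures a directed network analysis might use.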

Results: Sentiment was predominantly positive (posts: 47.23%; comments: 44.26%), with neutral sentiment comprising ~30% of content. The most frequent emotion was neutrality, followed by surprise and fear, indicating ambivalent user reactions. Prominent topics included open-source AI models, DeepSeek usage, device compatibility, comparisons with ChatGPT, and censorship concerns. Hyperlink analysis indicated strong engagement with GitHub, Hugging Face, and DeepSeek's own services. Network analysis revealed a fragmented but active community, with Open-Source AI Models forming the most cohesive cluster.

Discussion: Community discourse framed DeepSeek as both a technical tool and a geopolitical issue. Enthusiasm centered on its performance, accessibility, and open-source nature, while concerns were voiced about censorship, data privacy, and potential ideological influence. The integrated analysis shows that collective perception emerged through decentralized, dialogic engagement, reflecting broader sociotechnical tensions related to openness, trust, and legitimacy in global AI development.

Citations: 0
Tracing strategic divergence: archetypal and counterfactual analysis of StarCraft II gameplay trajectories.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-06 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1724493
Jie Zhang, Weilong Yang

Introduction: To address the challenges of data heterogeneity, strategic diversity, and process opacity in interpreting multi-agent decision-making within complex competitive environments, we have developed TRACE, an end-to-end analytical framework for StarCraft II gameplay.

Methods: This framework standardizes raw replay data into aligned state trajectories, extracts "typical strategic progressions" using a Conditional Recurrent Variational Autoencoder (C-RVAE), and quantifies the deviation of individual games from these archetypes via counterfactual alignment. Its core innovation is the introduction of a dimensionless deviation metric, |Δ|, which achieves process-level interpretability. This metric reveals "which elements are important" by ranking time-averaged feature contributions across aggregated categories (Economy, Military, Technology) and shows "when deviations occur" through temporal heatmaps, forging a verifiable evidence chain.
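
The abstract does not define |Δ|, so the following is only a sketch of how a dimensionless deviation could be computed and aggregated, under a loudly stated assumption: each feature's absolute gap between a game and its archetype is divided by a per-feature scale. The functions `abs_deviation`, `rank_categories`, and `first_divergence`, the scales, and the toy trajectories are all hypothetical and are not taken from TRACE.

```python
def abs_deviation(game, archetype, scales):
    """Assumed form of |Δ|: |game - archetype| / scale, per time step and feature.

    game, archetype: lists of rows (time steps), each a list of feature values.
    scales: one positive scale per feature, making the result dimensionless.
    """
    return [
        [abs(g - a) / s for g, a, s in zip(g_row, a_row, scales)]
        for g_row, a_row in zip(game, archetype)
    ]


def rank_categories(dev, feature_category):
    """Time-average |Δ| per feature, average within each category, sort descending."""
    n_steps = len(dev)
    per_feature = [
        sum(row[f] for row in dev) / n_steps for f in range(len(feature_category))
    ]
    totals, counts = {}, {}
    for value, category in zip(per_feature, feature_category):
        totals[category] = totals.get(category, 0.0) + value
        counts[category] = counts.get(category, 0) + 1
    means = {category: totals[category] / counts[category] for category in totals}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


def first_divergence(dev, threshold):
    """Index of the first time step whose mean |Δ| exceeds threshold, else None."""
    for t, row in enumerate(dev):
        if sum(row) / len(row) > threshold:
            return t
    return None


# Toy trajectories with three features: workers, army supply, tech level.
game = [[50, 10, 1], [60, 10, 1], [70, 12, 1]]
archetype = [[50, 10, 1], [60, 14, 2], [70, 20, 4]]
dev = abs_deviation(game, archetype, scales=[10, 4, 1])
print(rank_categories(dev, ["Economy", "Military", "Technology"]))
print(first_divergence(dev, threshold=0.5))
```

The ranked list corresponds to the "which elements are important" view, and the per-step rows of `dev` are the values a temporal heatmap would plot for the "when deviations occur" view.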

Results: Quantitative evaluation on professional tournament datasets demonstrates the framework's robustness, revealing that strategic deviations often crystallize in the early game (averaging 8.4% of match duration) and are frequently driven by critical technology timing gaps. The counterfactual generation module effectively restores strategic alignment, achieving an average similarity improvement of over 90% by correcting identified divergences. Furthermore, expert human evaluation confirms the practical utility of the system, awarding high scores for Factual Fidelity (4.6/5.0) and Causal Coherence (4.3/5.0) to the automatically generated narratives.

Discussion: By providing open-access code and reproducible datasets, TRACE lowers the barrier to large-scale replay analysis, offering an operational quantitative basis for macro-strategy understanding, coaching reviews, and AI model evaluation.

Citations: 0
Designing intelligent chatbots with ChatGPT: a framework for development and implementation.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1618791
Sajjad Hyder, Javeed Kittur

Background: The rapid evolution of interactive AI has reshaped human-computer interaction, with ChatGPT emerging as a key tool for chatbot development. Industries such as healthcare, customer service, and education increasingly integrate chatbots, highlighting the need for a structured development framework.

Purpose: This study proposes a framework for designing intelligent chatbots using ChatGPT, focusing on user experience, hybrid design models, prompt engineering, and system limitations. The framework aims to bridge the gap between technical innovation and real-world application.

Methods: A systematic literature review (SLR) was conducted, analyzing 40 relevant studies. The research was structured around three key questions: (1) How do user experience and engagement influence chatbot performance? (2) How do hybrid design models improve chatbot performance? (3) What are the limitations of using ChatGPT, and how does prompt engineering affect responses?

Results: The findings emphasize that well-designed user interactions enhance engagement and trust. Hybrid models integrating rule-based and machine learning techniques improve chatbot functionality. However, challenges such as response inconsistencies, ethical concerns, and prompt sensitivity require careful consideration. This study proposes a framework for the design, development, and implementation of effective chatbots with ChatGPT.
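
The hybrid idea mentioned above, rules first and a learned model as fallback, might look like the sketch below. It is an illustration only, not the framework proposed in the study: the rule table, the canned answers, and the `fallback_model` callable (a stand-in for whatever generative model is used) are all assumptions.

```python
import re

# Hypothetical rule table: (compiled pattern, canned answer).
RULES = [
    (re.compile(r"\b(opening hours|open)\b", re.IGNORECASE),
     "We are open 9:00-17:00, Monday to Friday."),
    (re.compile(r"\b(refund|return)\b", re.IGNORECASE),
     "Refunds are processed within 14 days of a return."),
]


def hybrid_reply(message, fallback_model):
    """Answer from the rule table when a pattern matches; otherwise defer to
    fallback_model, any callable that maps a message string to a reply string.

    Returns (reply, source) so callers can log which path produced the answer.
    """
    for pattern, answer in RULES:
        if pattern.search(message):
            return answer, "rule"
    return fallback_model(message), "model"


def echo_model(message):
    """Stand-in for a generative model call (assumption: no real API is used)."""
    return "Model reply to: " + message


print(hybrid_reply("When are you open?", echo_model))
print(hybrid_reply("Tell me about your products.", echo_model))
```

Routing predictable questions through fixed rules gives consistent answers where consistency matters, which is one way to limit the response inconsistencies and prompt sensitivity noted above.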

Conclusion: This study provides a structured framework for chatbot development with ChatGPT, offering insights into optimizing user experience, leveraging hybrid design, and mitigating limitations. The proposed framework serves as a practical guide for researchers, developers, and businesses aiming to create intelligent, user-centric chatbot solutions.

Citations: 0
GECOBench: a gender-controlled text dataset and benchmark for quantifying biases in explanations.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1694388
Rick Wilming, Artur Dox, Hjalmar Schulz, Marta Oliveira, Benedict Clark, Stefan Haufe

Large pre-trained language models have become a crucial backbone for many downstream tasks in natural language processing (NLP). Because they are trained on a plethora of data containing a variety of biases, such as gender biases, they have been shown to inherit such biases in their weights, potentially affecting their prediction behavior. However, it is unclear to what extent these biases also affect feature attributions generated by applying "explainable artificial intelligence" (XAI) techniques, possibly in unfavorable ways. To systematically study this question, we create a gender-controlled text dataset, GECO, in which the alteration of grammatical gender forms induces class-specific words and provides ground truth feature attributions for gender classification tasks. This enables an objective evaluation of the correctness of XAI methods. We apply this dataset to the pre-trained BERT model, which we fine-tune to different degrees, to quantitatively measure how pre-training induces undesirable bias in feature attributions and to what extent fine-tuning can mitigate such explanation bias. To this end, we provide GECOBench, a rigorous quantitative evaluation framework for benchmarking popular XAI methods. We show a clear dependency between explanation performance and the number of fine-tuned layers, where XAI methods are observed to benefit particularly from fine-tuning or complete retraining of embedding layers.
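
With ground-truth attributions available, an explanation can be scored by how much of its attribution mass falls on the words that truly determine the class. The abstract does not name the metrics GECOBench uses, so the measure below is an assumed example: the share of absolute attribution assigned to ground-truth tokens. The function name `mass_accuracy`, the sentence, and the attribution values are illustrative.

```python
def mass_accuracy(attributions, ground_truth_mask):
    """Share of absolute attribution mass that lands on ground-truth tokens.

    attributions: one score per token from some XAI method.
    ground_truth_mask: 1 for tokens that determine the class, 0 otherwise.
    Returns a value in [0, 1]; 0.0 if all attributions are zero.
    """
    total = sum(abs(a) for a in attributions)
    if total == 0:
        return 0.0
    on_target = sum(
        abs(a) for a, is_truth in zip(attributions, ground_truth_mask) if is_truth
    )
    return on_target / total


# Toy sentence: the gendered words "she" and "her" are the ground truth.
tokens = ["she", "opened", "her", "bag"]
mask = [1, 0, 1, 0]
attributions = [2.0, 0.5, 1.0, 0.5]  # made-up scores from a hypothetical XAI method
print(mass_accuracy(attributions, mask))  # 0.75
```

Comparing such a score across models fine-tuned to different depths is one way to express the dependency between explanation performance and the number of fine-tuned layers described above.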

Frontiers in Artificial Intelligence, volume 8, article 1694388. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12813014/pdf/
Citations: 0
Convolutional neural networks and mixture of experts for intrusion detection in 5G networks and beyond.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1708953
Loukas Ilias, George Doukas, Vangelis Lamprou, Christos Ntanos, Dimitris Askounis

The advent of 6G/NextG networks offers numerous benefits, including extreme capacity, reliability, and efficiency. To mitigate emerging security threats, 6G/NextG networks incorporate advanced artificial intelligence algorithms. However, existing studies on intrusion detection predominantly rely on deep neural networks with static components that are not conditionally dependent on the input, thereby limiting their representational power and efficiency. To address these issues, we present the first study to integrate a Mixture of Experts (MoE) architecture for the identification of malicious traffic. Specifically, we use network traffic data and convert the 1D feature array into a 2D matrix. Next, we pass this matrix through a convolutional neural network (CNN) layer, followed by batch normalization and max pooling layers. Subsequently, a sparsely gated MoE layer is used. This layer consists of a set of expert networks (dense layers) and a router that assigns weights to each expert's output. Sparsity is achieved by selecting only the most relevant experts from the full set. Finally, we conduct a series of ablation experiments to demonstrate the effectiveness of our proposed model. Experiments are conducted on the 5G-NIDD dataset, a network intrusion detection dataset generated from a real 5G test network, and the NANCY dataset, which includes cyberattacks from the O-RAN 5G Testbed Dataset. The results show that our introduced approach achieves accuracies of up to 99.96% and 79.59% on the 5G-NIDD and NANCY datasets, respectively. The findings also show that our proposed model offers multiple advantages over state-of-the-art approaches.
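
The sparsely gated MoE layer described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the CNN, batch normalization, and pooling stages are omitted, the weights are random, and the names `sparse_moe`, `random_layer`, and `top_k` as well as the layer sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """One dense layer: y[j] = sum_i x[i] * weights[i][j] + bias[j]."""
    return [
        sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
        for j in range(len(bias))
    ]


def sparse_moe(x, router_w, router_b, experts, top_k=2):
    """Send x to the top_k experts and mix their outputs by gate weight."""
    logits = dense(x, router_w, router_b)  # one routing score per expert
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    gates = softmax([logits[i] for i in chosen])  # renormalize over selected experts only
    output = [0.0] * len(experts[0][1])
    for gate, idx in zip(gates, chosen):
        expert_w, expert_b = experts[idx]
        y = dense(x, expert_w, expert_b)  # unselected experts are never evaluated
        output = [o + gate * yj for o, yj in zip(output, y)]
    return output, chosen, gates


def random_layer(rng, n_in, n_out):
    """Hypothetical helper: random weights and zero bias for one dense layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out


rng = random.Random(0)
n_features, n_experts, n_classes = 8, 4, 3  # assumed sizes for illustration
router_w, router_b = random_layer(rng, n_features, n_experts)
experts = [random_layer(rng, n_features, n_classes) for _ in range(n_experts)]
x = [rng.uniform(-1, 1) for _ in range(n_features)]  # stands in for pooled CNN features

output, chosen, gates = sparse_moe(x, router_w, router_b, experts, top_k=2)
```

Because only `top_k` experts run per input, the computation depends on the input, which is the conditional behavior the abstract contrasts with static network components.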

Frontiers in Artificial Intelligence, volume 8, article 1708953. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12813042/pdf/
Citations: 0
A Turing Test for artificial nets devoted to vision.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1665874
Jorge Vila-Tomás, Pablo Hernández-Cámara, Qiang Li, Valero Laparra, Jesús Malo

In this work we argue that, despite recent claims about successful modeling of the visual brain using deep nets, the problem is far from being solved, particularly for low-level vision. Open issues include: Where should we read from in ANNs to check behavior? What should the read-out be? Is this ad hoc read-out considered part of the brain model or not? To understand vision ANNs, should we use artificial psychophysics or artificial physiology? And should artificial tests literally match the experiments done with humans? These questions suggest a clear need for biologically sensible tests for deep models of the visual brain and, more generally, for understanding ANNs devoted to generic vision tasks. Following our use of low-level facts from Vision Science in Image Processing, we present a low-level dataset compiling the basic spatio-chromatic properties that describe the adaptive bottleneck of the retina-V1 pathway and are not currently available in popular databases such as BrainScore. We propose its use for qualitative and quantitative model evaluation. As an illustration of the proposed methods, we check the behavior of three recent models with similar deep architectures: (1) a parametric model tuned via the psychophysical method of Maximum Differentiation [Malo & Simoncelli SPIE 15, Martinez et al. PLOS 18, Martinez et al. Front. Neurosci. 19], (2) a non-parametric model (the PerceptNet) tuned to maximize the correlation with humans on subjective image distortions [Hepburn et al. IEEE ICIP 20], and (3) a model with the same encoder as the PerceptNet, but tuned for image segmentation [Hernandez-Camara et al. Patt.Recogn.Lett. 23, Hernandez-Camara et al. Neurocomp. 25]. Results on the 10 proposed psycho/physio visual properties show that the first (parametric) model is the one with behavior closest to humans.

Frontiers in Artificial Intelligence, volume 8, article 1665874. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12812997/pdf/
Citations: 0
Anatomical study and early diagnosis of dome galls in Cordia Dichotoma using DeepSVM model.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1558358
Said Khalid Shah, Mazliham Bin Mohd Su'ud, Aurangzeb Khan, Muhammad Mansoor Alam, Muhammad Ayaz

Introduction: Artificial intelligence (AI), particularly deep learning (DL), offers automated solutions for early detection of plant diseases to improve crop yield. However, training accurate models on real-field data remains challenging due to overfitting and limited generalization. As observed in prior studies, traditional CNNs often struggle with real-environment variability, and transfer learning can lead to instability in training on domain-specific leaf datasets. This study focuses on detecting dome galls, a disease in Cordia dichotoma, by formulating a binary classification task (healthy vs. diseased leaves) using a custom dataset of 3,900 leaf images collected from real field environments.

Methods: Initially, both custom CNNs and transfer learning models were trained and compared. Among them, a modified ResNet-50 architecture showed promising results but suffered from overfitting and unstable convergence. To address this, the final sigmoid activation layer was replaced with a Support Vector Machine (SVM), and L2 regularization was applied to reduce overfitting. This hybrid DeepSVM architecture stabilized training and improved model robustness. Image preprocessing and augmentation techniques were applied to increase variability and prevent overfitting.

Results: The final model was evaluated on a separate test set of 400 images, and the results remained stable across repeated runs. DeepSVM achieved an accuracy of 94.50% and an F1-score of 94.47%, outperforming other well-known models such as VGG-16, InceptionResNetV2, and MobileNet-V2.

Conclusion: These results indicate that the proposed DeepSVM approach offers better generalization and training stability than conventional CNN classifiers, potentially aiding in automated disease monitoring for precision agriculture.
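
The Methods paragraph above replaces a sigmoid output with an SVM and adds L2 regularization. One common way to realize such a head is a linear classifier trained with an L2-regularized hinge loss; the abstract does not say how the authors trained theirs, so the following is a sketch under that assumption. The two-dimensional toy points stand in for CNN features, and `train_linear_svm`, `l2`, `lr`, and `epochs` are hypothetical names and settings.

```python
import random


def score(w, b, x):
    """Signed distance-like score of x under the linear model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hinge_loss(w, b, xs, ys, l2):
    """Mean hinge loss plus an L2 penalty on the weights; labels are -1 or +1."""
    data_term = sum(max(0.0, 1.0 - y * score(w, b, x)) for x, y in zip(xs, ys)) / len(xs)
    return data_term + 0.5 * l2 * sum(wi * wi for wi in w)


def train_linear_svm(xs, ys, l2=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [l2 * wi for wi in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(xs, ys):
            if y * score(w, b, x) < 1.0:  # only margin violators contribute
                grad_w = [g - y * xi / n for g, xi in zip(grad_w, x)]
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Class label: +1 (e.g. diseased) or -1 (e.g. healthy)."""
    return 1 if score(w, b, x) >= 0 else -1


# Synthetic, well-separated toy features; not data from the study.
rng = random.Random(0)
xs, ys = [], []
for _ in range(50):
    xs.append([rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)])
    ys.append(1)
    xs.append([rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)])
    ys.append(-1)

w, b = train_linear_svm(xs, ys)
accuracy = sum(predict(w, b, x) == y for x, y in zip(xs, ys)) / len(xs)
final_loss = hinge_loss(w, b, xs, ys, l2=0.01)
```

Unlike a sigmoid trained with cross-entropy, the hinge loss stops penalizing samples once they clear the margin, which is one plausible reason a max-margin head can be less prone to overfitting on a small dataset.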

Frontiers in Artificial Intelligence, volume 8, article 1558358. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12813105/pdf/
Citations: 0