Frontiers in Artificial Intelligence: Latest Articles

Explainable AI-driven MRI-based brain tumor classification: a novel deep learning approach.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-08 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1700214
Vinayaka R Srinivas, Ramasubramanian Parvathi

Introduction: Brain tumors are among the most aggressive forms of cancer, requiring precise diagnosis and treatment planning to improve patient outcomes. This study aims to develop an efficient deep learning-based framework for the classification of brain tumors using MRI data.

Methods: The methodology employs Convolutional Neural Networks (CNNs) to classify tumors into four categories: normal, glioma, pituitary, and meningioma. Key preprocessing techniques, including noise reduction, resizing, and data augmentation, were applied to enhance the robustness of the model. Advanced architectures such as DenseNet50, VGG19, and other transfer learning models, along with CNN variants, were trained and evaluated for their performance. Explainable AI (XAI) techniques, including Grad-CAM, LIME, and feature map visualizations, were used to visualize the model's decision-making process and to identify areas for improvement during training, which guided the development of a better model.
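As a rough illustration of how Grad-CAM, one of the XAI techniques named above, turns the feature maps of a convolutional layer into a heatmap, the sketch below implements only the combination step in plain Python. It assumes the feature maps and the gradients of the class score with respect to them have already been extracted from a trained network; the function name `grad_cam_map` and the nested-list layout are hypothetical and are not taken from the study.

```python
def grad_cam_map(activations, gradients):
    """Combine feature maps into a Grad-CAM heatmap.

    activations[k][i][j]: value of feature map k at position (i, j).
    gradients[k][i][j]:   gradient of the class score w.r.t. that value.
    Both inputs are assumed to come from the same convolutional layer.
    """
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weights: global average of the gradients over each map.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU so that only
    # regions with a positive influence on the class score remain.
    heatmap = []
    for i in range(height):
        row = []
        for j in range(width):
            value = sum(w * act[i][j] for w, act in zip(weights, activations))
            row.append(max(0.0, value))
        heatmap.append(row)
    return heatmap
```

In practice the resulting map is upsampled to the MRI slice resolution and overlaid on the image, which is what lets a reader check whether the model attends to the tumor region.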

Results: The best-performing model, a 4-conv-1-dense-1-dropout CNN, achieved a classification accuracy of 95.86%, outperforming deeper architectures and transfer learning approaches. The findings underscore the potential of deep learning models for reliable and efficient brain tumor classification. This work concludes with recommendations for real-time deployment in clinical settings and explores future integration with Large Language Models (LLMs) to generate detailed diagnostic reports.
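To make the "4-conv-1-dense-1-dropout" label concrete, the following sketch counts the trainable parameters of one such network. The abstract does not report layer widths, kernel sizes, or input resolution, so every default below (128x128 single-channel input, 3x3 kernels with "same" padding, 2x2 max pooling after each convolution, filter counts 32/64/128/128, a 256-unit dense layer) is an assumption chosen only for illustration.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of a square 2D convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def count_parameters(input_size=128, input_channels=1,
                     conv_filters=(32, 64, 128, 128),
                     dense_units=256, num_classes=4):
    """Parameter count of a hypothetical 4-conv-1-dense-1-dropout CNN."""
    total = 0
    size, channels = input_size, input_channels
    for filters in conv_filters:
        total += conv_params(channels, filters)
        channels = filters
        size //= 2  # assumed 2x2 max pooling halves each spatial dimension
    flattened = size * size * channels
    total += dense_params(flattened, dense_units)
    # The dropout layer has no trainable parameters.
    total += dense_params(dense_units, num_classes)  # 4-way output layer
    return total
```

Under these assumptions most parameters sit in the single dense layer, which is one reason a dropout layer is usually placed right after it.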

Frontiers in Artificial Intelligence, vol. 8, article 1700214. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823918/pdf/
Citations: 0
Demographic identification of Greater Caribbean manatees via acoustic feature learning.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-08 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1660388
Fernando Merchan, Kenji Contreras, Héctor Poveda, Rocío M Estévez, Hector M Guzman, Javier E Sanchez-Galan

Demographic inference from vocalizations is essential for monitoring endangered Greater Caribbean manatees (Trichechus manatus manatus) in tropical environments where direct observation is limited. While passive acoustic monitoring has proven effective for manatee detection and individual identification, the ability to classify sex and age from vocalizations remains unexplored, limiting ecological insights into population structure and reproductive dynamics. We investigated whether machine learning can accurately classify sex and age from manatee acoustic signals using 1,285 vocalizations from 20 wild individuals captured in the Changuinola River, Panama. Acoustic features including spectral envelope descriptors (MFCCs), harmonic content (chroma), and temporal-frequency parameters were extracted and analyzed using two feature sets: SET1 (30 spectral-cepstral features) and SET2 (38 features augmented with explicit pitch and temporal descriptors). Four classification algorithms (Random Forest, XGBoost, SVM, LDA) were trained under Leave-One-Group-Out cross-validation with SMOTE oversampling to address class imbalance. Sex classification achieved 85%-87% accuracy (75%-78% macro-F1) with balanced performance across both classes (female: 86%, male: 79%), validating operational feasibility for passive monitoring applications. However, subject-level bootstrap analysis revealed substantial individual heterogeneity (95% CI, female: 68.7%-96.4%; male: 75.1%-83.6%), indicating that approximately 10%-15% of individuals exhibit systematic misclassification due to atypical acoustic signatures. Spectral envelope characteristics (MFCCs, spectral skewness) rather than fundamental frequency were most discriminative, suggesting sex-related variation manifests in vocal tract resonance patterns. Age classification achieved 73%-85% global accuracy but exhibited severe juvenile under-detection (14%-26% recall), with bootstrap confidence intervals spanning 9.3%-86.3% for juveniles vs. 60.7%-84.7% for adults. Dimensionality reduction (PCA, t-SNE) revealed substantial overlap between juvenile and adult acoustic feature distributions, with clearer age structure visible primarily within female clusters, contributing to systematic misclassification of male juveniles. Threshold optimization improved juvenile recall to 63% but increased false positives to 37%, presenting trade-offs for conservation surveillance. Acoustic body size regression demonstrated promising continuous estimation (MAE = 0.208 m, R² = 0.33), offering an alternative to categorical age classification by enabling coarse demographic profiling when integrated with sex inference. These findings establish the operational viability of acoustic sex classification for manatee conservation while highlighting fundamental challenges in categorical age inference due to continuous ontogenetic variation and limited juvenile samples. However, acoustic body size regression offers a promising complementary approach, enabling continuous demographic profiling across body size classes rather than discrete age categories. Combined with established individual identification frameworks, it could enable comprehensive acoustic mark-recapture that simultaneously estimates abundance, sex ratio, size distribution, and population structure from long-term hydrophone deployments, without visual confirmation of body dimensions.
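The Leave-One-Group-Out protocol mentioned in the abstract can be sketched in a few lines. The version below is a minimal plain-Python illustration, assuming each vocalization is tagged with the identifier of the animal that produced it; the function name and the toy identifiers are hypothetical. Holding out all calls of one individual at a time prevents the classifier from being scored on an animal it has already heard during training.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) splits.

    groups[i] is the individual that produced sample i. Each split holds
    out every sample from exactly one individual.
    """
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Toy example: five calls from three hypothetical animals.
calls_by_animal = ["m1", "m1", "m2", "m3", "m3"]
splits = list(leave_one_group_out(calls_by_animal))
```

When oversampling such as SMOTE is combined with this scheme, the usual practice is to oversample only the training indices of each split, so that synthetic samples never leak information about the held-out individual. The abstract does not state how the authors ordered these steps.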
Frontiers in Artificial Intelligence, vol. 8, article 1660388. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823911/pdf/
Citations: 0
Correction: Machine learning-based detection of cognitive decline using SSWTRT: classification performance and decision analysis.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-08 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1764066
Yuji Nozaki, Chihiro Kamohara, Ryota Abe, Taiki Ieda, Madoka Nakajima, Maki Sakamoto

[This corrects the article DOI: 10.3389/frai.2025.1689182.].

Frontiers in Artificial Intelligence, vol. 8, article 1764066. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12824601/pdf/
Citations: 0
Multi-modal AI in precision medicine: integrating genomics, imaging, and EHR data for clinical insights.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-07 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1743921
Shahper Nazeer Khan, Danishuddin, Mohd Wajid Ali Khan, Luca Guarnera, Syed Mohammad Fauzan Akhtar

Precision healthcare is increasingly oriented toward the development of therapeutic strategies that are as individualized as the patients receiving them. Central to this paradigm shift is artificial intelligence (AI)-enabled multi-modal data integration, which consolidates heterogeneous data streams (genomic, transcriptomic, proteomic, imaging, environmental, and electronic health record (EHR) data) into a unified analytical framework. This integrative approach enhances early disease detection, facilitates the discovery of clinically actionable biomarkers, and accelerates rational drug development, with particularly significant implications for oncology, neurology, and cardiovascular medicine. Advanced machine learning (ML) and deep learning (DL) algorithms are capable of extracting complex, non-linear associations across data modalities, thereby improving diagnostic precision, enabling robust risk stratification, and informing patient-specific therapeutic interventions. Furthermore, AI-driven applications in digital health, such as wearable biosensors and real-time physiological monitoring, allow for continuous, dynamic refinement of treatment plans. This review examines the transformative potential of multi-modal AI in precision medicine, with emphasis on its role in multi-omics data integration, predictive modeling, and clinical decision support. In parallel, it critically evaluates prevailing challenges, including data interoperability, algorithmic bias, and ethical considerations surrounding patient privacy. The synergistic convergence of AI and multi-modal data represents not merely a technological innovation but a fundamental redefinition of individualized healthcare delivery.

Frontiers in Artificial Intelligence, vol. 8, article 1743921. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12819606/pdf/
Citations: 0
Use of machine learning models to predict mechanical ventilation, ECMO, and mortality in COVID-19.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-06 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1661637
Nina Moorman, Erin Hedlund-Botti, Grace Gombolay, Matthew C Gombolay

Introduction: Patients with severe COVID-19 may require mechanical ventilation (MV) or extracorporeal membrane oxygenation (ECMO). Predicting who will require these interventions, and for how long, is challenging due to the diverse responses among patients and the dynamic nature of the disease. As such, there is a need for better prediction of the duration and outcomes of MV use, to improve patient care and aid MV and ECMO allocation. Here we develop and examine the performance of machine learning (ML) models to predict MV duration, ECMO use, and mortality for patients with COVID-19.

Methods: In this retrospective prognostic study, hierarchical machine-learning models were developed to predict MV duration and patient outcomes from demographic data and time-series data consisting of vital signs and laboratory results. The models were trained on 10,378 patients from Emory's COVID CRADLE Dataset who tested positive for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and sought treatment at Emory University Hospital between February 28, 2020, and January 24, 2022. Analysis was conducted between January 10, 2022, and April 5, 2024. The main outcomes and measures were the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the F-score for MV duration, need for ECMO, and mortality prediction.
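For readers less familiar with the headline metric, the sketch below computes AUROC directly from its probabilistic definition: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The `weighted_average` helper shows one common way to combine per-class scores by class size for a multi-class outcome such as MV duration. Both function names and the toy numbers are hypothetical; the abstract does not describe the authors' exact averaging procedure.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


def weighted_average(values, supports):
    """Average per-class metric values, weighted by class sample counts."""
    return sum(v * n for v, n in zip(values, supports)) / sum(supports)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; library implementations obtain the same value from sorted ranks.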

Results: Data from 10,378 patients with COVID-19 (median [IQR] age, 60 [48-72] years; 5,281 [50.89%] women) were included. Overall MV class distributions for 0 days, 1-4 days, 5-9 days, 10-14 days, 15-19 days, 20-24 days, 25-29 days, and ≥30 days of MV were 8,141 (78.44%), 812 (7.82%), 325 (3.13%), 241 (2.32%), 153 (1.47%), 97 (0.93%), 87 (0.84%), and 522 (5.03%), respectively. Overall, 15 patients (0.14%) received ECMO and 1,114 (10.73%) died. On MV duration, ECMO use, and mortality outcomes, the highest-performing models reached weighted average AUROC scores of 0.873, 0.902, and 0.774 and weighted average AUPRC scores of 0.790, 0.999, and 0.893, respectively.
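The class counts reported above can be checked with a few lines of arithmetic, which also makes the degree of class imbalance explicit. The sketch below uses only the counts given in the abstract; the variable names are illustrative.

```python
# MV-duration class counts as reported in the abstract.
mv_classes = ["0 days", "1-4 days", "5-9 days", "10-14 days",
              "15-19 days", "20-24 days", "25-29 days", ">=30 days"]
mv_counts = [8141, 812, 325, 241, 153, 97, 87, 522]

total_patients = sum(mv_counts)

# Share of each class, in percent, rounded to two decimals.
mv_shares = [round(100 * count / total_patients, 2) for count in mv_counts]

# Accuracy of a trivial model that always predicts the largest class.
majority_baseline = max(mv_shares)

for name, share in zip(mv_classes, mv_shares):
    print(f"{name:>10}: {share:5.2f}%")
```

The counts sum to the reported cohort size, and the shares match the published percentages. Because the "0 days" class alone covers more than three quarters of patients, plain accuracy would reward always predicting that class, which is consistent with the study reporting AUROC, AUPRC, and F-score instead.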

Conclusions and relevance: Hierarchical ML models trained on vital signs, laboratory results, and demographic data show promise for the prediction of MV duration, ECMO use, and mortality in COVID-19 patients.

Frontiers in Artificial Intelligence, vol. 8, article 1661637. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12816323/pdf/
Citations: 0
Understanding user perceptions of DeepSeek: insights from sentiment, topic and network analysis using a Reddit-based study.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-06 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1703949
Naisarg Patel, Rajesh Sharma, Prakash Lingasamy, Vino Sundararajan, Sajitha Lulu Sudhakaran, Vijayachitra Modhukur

Introduction: The launch of DeepSeek, a Chinese open-source generative AI model, generated substantial discussion regarding its capabilities and implications. The r/deepseek subreddit emerged as a key forum for real-time public evaluation. Analyzing this discourse is essential for understanding the sociotechnical perceptions shaping the integration of emerging AI systems.

Methods: We analyzed 46,649 posts and comments from r/deepseek (January-May 2025) using a computational framework combining VADER sentiment analysis, Hartmann emotion classification, BERTopic for thematic modeling, hyperlink extraction, and directed network analysis. Data preprocessing included cleaning, normalization, and lemmatization. We also examined correlations between sentiment/emotion scores and dominant topics.
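VADER assigns each text a compound score between -1 and 1, which is then usually mapped to a positive, neutral, or negative label. The sketch below shows that mapping and the resulting category shares, assuming the commonly used cut-offs of +0.05 and -0.05; the abstract does not say which thresholds the authors applied, and the function names and sample scores are hypothetical. The compound scores themselves would come from a sentiment analyzer, which is not reproduced here.

```python
from collections import Counter


def label_sentiment(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def sentiment_shares(compound_scores, threshold=0.05):
    """Percentage of texts falling into each sentiment category."""
    labels = [label_sentiment(score, threshold) for score in compound_scores]
    counts = Counter(labels)
    return {
        label: round(100 * counts[label] / len(labels), 2)
        for label in ("positive", "neutral", "negative")
    }
```

Computing the shares separately for posts and for comments would give figures comparable in form to the ones reported in the results.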

Results: Sentiment was predominantly positive (posts: 47.23%; comments: 44.26%), with neutral sentiment comprising ~30% of content. The most frequent emotion was neutrality, followed by surprise and fear, indicating ambivalent user reactions. Prominent topics included open-source AI models, DeepSeek usage, device compatibility, comparisons with ChatGPT, and censorship concerns. Hyperlink analysis indicated strong engagement with GitHub, Hugging Face, and DeepSeek's own services. Network analysis revealed a fragmented but active community, with Open-Source AI Models forming the most cohesive cluster.
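A hyperlink analysis of this kind typically starts by pulling URLs out of the post text and tallying their domains. The following is a minimal sketch of that step using only the Python standard library; the regular expression, the function name `domain_counts`, and the handling of a leading "www." are illustrative assumptions rather than the authors' pipeline.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Deliberately simple URL pattern: scheme, then anything up to whitespace
# or a closing bracket. Real-world text may need a stricter pattern.
URL_PATTERN = re.compile(r"https?://[^\s<>)\]]+")


def domain_counts(texts):
    """Count how often each linked domain appears across the given texts."""
    counts = Counter()
    for text in texts:
        for url in URL_PATTERN.findall(text):
            host = urlparse(url).netloc.lower()
            if host.startswith("www."):
                host = host[4:]  # treat www.example.com and example.com alike
            if host:
                counts[host] += 1
    return counts
```

Calling `domain_counts(posts).most_common(10)` on a list of post bodies would list the most frequently linked sites.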

Discussion: Community discourse framed DeepSeek as both a technical tool and a geopolitical issue. Enthusiasm centered on its performance, accessibility, and open-source nature, while concerns were voiced about censorship, data privacy, and potential ideological influence. The integrated analysis shows that collective perception emerged through decentralized, dialogic engagement, reflecting broader sociotechnical tensions related to openness, trust, and legitimacy in global AI development.

{"title":"Understanding user perceptions of DeepSeek: insights from sentiment, topic and network analysis using a Reddit-based study.","authors":"Naisarg Patel, Rajesh Sharma, Prakash Lingasamy, Vino Sundararajan, Sajitha Lulu Sudhakaran, Vijayachitra Modhukur","doi":"10.3389/frai.2025.1703949","DOIUrl":"10.3389/frai.2025.1703949","url":null,"abstract":"<p><strong>Introduction: </strong>The launch of DeepSeek, a Chinese open-source generative AI model, generated substantial discussion regarding its capabilities and implications. The r/deepseek subreddit emerged as a key forum for real-time public evaluation. Analyzing this discourse is essential for understanding the sociotechnical perceptions shaping the integration of emerging AI systems.</p><p><strong>Methods: </strong>We analyzed 46,649 posts and comments from r/deepseek (January-May 2025) using a computational framework combining VADER sentiment analysis, Hartmann emotion classification, BERTopic for thematic modeling, hyperlink extraction, and directed network analysis. Data preprocessing included cleaning, normalization, and lemmatization. We also examined correlations between sentiment/emotion scores and dominant topics.</p><p><strong>Results: </strong>Sentiment was predominantly positive (posts: 47.23%; comments: 44.26%), with neutral sentiment comprising ~30% of content. The most frequent emotion was neutrality, followed by surprise and fear, indicating ambivalent user reactions. Prominent topics included open-source AI models, DeepSeek usage, device compatibility, comparisons with ChatGPT, and censorship concerns. Hyperlink analysis indicated strong engagement with GitHub, Hugging Face, and DeepSeek's own services. Network analysis revealed a fragmented but active community, depicting Open-Source AI Models as the most cohesive cluster.</p><p><strong>Discussion: </strong>Community discourse framed DeepSeek as both a technical tool and a geopolitical issue. 
Enthusiasm centered on its performance, accessibility, and open-source nature, while concerns were voiced about censorship, data privacy, and potential ideological influence. The integrated analysis shows that collective perception emerged through decentralized, dialogic engagement, reflecting broader sociotechnical tensions related to openness, trust, and legitimacy in global AI development.</p>","PeriodicalId":33315,"journal":{"name":"Frontiers in Artificial Intelligence","volume":"8 ","pages":"1703949"},"PeriodicalIF":4.7,"publicationDate":"2026-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12816320/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146019755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Tracing strategic divergence: archetypal and counterfactual analysis of StarCraft II gameplay trajectories.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-06 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1724493
Jie Zhang, Weilong Yang

Introduction: To address the challenges of data heterogeneity, strategic diversity, and process opacity in interpreting multi-agent decision-making within complex competitive environments, we have developed TRACE, an end-to-end analytical framework for StarCraft II gameplay.

Methods: This framework standardizes raw replay data into aligned state trajectories, extracts "typical strategic progressions" using a Conditional Recurrent Variational Autoencoder (C-RVAE), and quantifies the deviation of individual games from these archetypes via counterfactual alignment. Its core innovation is the introduction of a dimensionless deviation metric, |Δ|, which achieves process-level interpretability. This metric reveals "which elements are important" by ranking time-averaged feature contributions across aggregated categories (Economy, Military, Technology) and shows "when deviations occur" through temporal heatmaps, forging a verifiable evidence chain.

Results: Quantitative evaluation on professional tournament datasets demonstrates the framework's robustness, revealing that strategic deviations often crystallize in the early game (averaging 8.4% of match duration) and are frequently driven by critical technology timing gaps. The counterfactual generation module effectively restores strategic alignment, achieving an average similarity improvement of over 90% by correcting identified divergences. Furthermore, expert human evaluation confirms the practical utility of the system, awarding high scores for Factual Fidelity (4.6/5.0) and Causal Coherence (4.3/5.0) to the automatically generated narratives.

Discussion: By providing open-access code and reproducible datasets, TRACE lowers the barrier to large-scale replay analysis, offering an operational quantitative basis for macro-strategy understanding, coaching reviews, and AI model evaluation.
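
The abstract does not define |Δ|, so the following is only a rough sketch of how a dimensionless deviation from an archetype could be computed, ranked by category, and located in time. The per-feature scaling, the function names, the feature names, and the threshold are all assumptions made for illustration and are not taken from TRACE.

```python
def deviation_matrix(game, archetype, scale):
    """Per-timestep, per-feature |game - archetype| / scale (unitless by construction)."""
    return [
        {f: abs(g[f] - a[f]) / scale[f] for f in scale}
        for g, a in zip(game, archetype)
    ]


def rank_categories(dev, categories):
    """Time-average each feature's deviation, sum per category, sort descending."""
    steps = len(dev)
    totals = {}
    for feature, category in categories.items():
        mean_dev = sum(row[feature] for row in dev) / steps
        totals[category] = totals.get(category, 0.0) + mean_dev
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


def first_divergence_fraction(dev, threshold):
    """Fraction of the match elapsed when the summed deviation first exceeds
    the threshold, or None if it never does."""
    for t, row in enumerate(dev):
        if sum(row.values()) > threshold:
            return t / len(dev)
    return None


# Hypothetical toy trajectories: the game matches the archetype except that
# its technology count lags by one from the second timestep onward.
archetype = [
    {"workers": 12, "army": 0, "tech": 0},
    {"workers": 16, "army": 2, "tech": 1},
    {"workers": 20, "army": 6, "tech": 1},
    {"workers": 24, "army": 10, "tech": 2},
]
game = [
    {"workers": 12, "army": 0, "tech": 0},
    {"workers": 16, "army": 2, "tech": 0},
    {"workers": 20, "army": 6, "tech": 0},
    {"workers": 24, "army": 10, "tech": 1},
]
scale = {"workers": 4.0, "army": 2.0, "tech": 1.0}  # assumed per-feature scales
categories = {"workers": "Economy", "army": "Military", "tech": "Technology"}

dev = deviation_matrix(game, archetype, scale)
print(rank_categories(dev, categories))
print(first_divergence_fraction(dev, threshold=0.5))
```

The list of per-timestep dictionaries in `dev` is the kind of data a temporal heatmap would plot, with time on one axis and features on the other.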

Citations: 0
Designing intelligent chatbots with ChatGPT: a framework for development and implementation.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1618791
Sajjad Hyder, Javeed Kittur

Background: The rapid evolution of interactive AI has reshaped human-computer interaction, with ChatGPT emerging as a key tool for chatbot development. Industries such as healthcare, customer service, and education increasingly integrate chatbots, highlighting the need for a structured development framework.

Purpose: This study proposes a framework for designing intelligent chatbots using ChatGPT, focusing on user experience, hybrid design models, prompt engineering, and system limitations. The framework aims to bridge the gap between technical innovation and real-world application.

Methods: A systematic literature review (SLR) was conducted, analyzing 40 relevant studies. The research was structured around three key questions: (1) How do user experience and engagement influence chatbot performance? (2) How do hybrid design models improve chatbot performance? (3) What are the limitations of using ChatGPT, and how does prompt engineering affect responses?

Results: The findings emphasize that well-designed user interactions enhance engagement and trust. Hybrid models integrating rule-based and machine learning techniques improve chatbot functionality. However, challenges such as response inconsistencies, ethical concerns, and prompt sensitivity require careful consideration. The study proposes a framework for the design, development, and implementation of effective chatbots with ChatGPT.

Conclusion: This study provides a structured framework for chatbot development with ChatGPT, offering insights into optimizing user experience, leveraging hybrid design, and mitigating limitations. The proposed framework serves as a practical guide for researchers, developers, and businesses aiming to create intelligent, user-centric chatbot solutions.

Citations: 0
GECOBench: a gender-controlled text dataset and benchmark for quantifying biases in explanations.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1694388
Rick Wilming, Artur Dox, Hjalmar Schulz, Marta Oliveira, Benedict Clark, Stefan Haufe

Large pre-trained language models have become a crucial backbone for many downstream tasks in natural language processing (NLP). Because they are trained on a plethora of data containing a variety of biases, such as gender biases, they have been shown to inherit such biases in their weights, potentially affecting their prediction behavior. However, it is unclear to what extent these biases also affect feature attributions generated by applying "explainable artificial intelligence" (XAI) techniques, possibly in unfavorable ways. To systematically study this question, we create a gender-controlled text dataset, GECO, in which the alteration of grammatical gender forms induces class-specific words and provides ground truth feature attributions for gender classification tasks. This enables an objective evaluation of the correctness of XAI methods. We apply this dataset to the pre-trained BERT model, which we fine-tune to different degrees, to quantitatively measure how pre-training induces undesirable bias in feature attributions and to what extent fine-tuning can mitigate such explanation bias. To this end, we provide GECOBench, a rigorous quantitative evaluation framework for benchmarking popular XAI methods. We show a clear dependency between explanation performance and the number of fine-tuned layers, where XAI methods are observed to benefit particularly from fine-tuning or complete retraining of embedding layers.
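
To illustrate how ground-truth feature attributions make an objective evaluation possible, here is a minimal sketch of one simple score: the share of absolute attribution mass that an XAI method places on the ground-truth (gendered) tokens. This is not necessarily the metric used in GECOBench. The function name, the example sentence, and the attribution values are assumptions for illustration.

```python
def attribution_mass_on_ground_truth(attributions, ground_truth_mask):
    """Share of total absolute attribution that falls on ground-truth tokens.

    attributions: one importance score per token (any sign).
    ground_truth_mask: 1 for tokens that define the class, 0 otherwise.
    Returns a value in [0, 1]; higher means the explanation focuses on the
    tokens that actually determine the label.
    """
    if len(attributions) != len(ground_truth_mask):
        raise ValueError("attributions and mask must have the same length")
    total = sum(abs(a) for a in attributions)
    if total == 0:
        return 0.0
    on_target = sum(abs(a) for a, m in zip(attributions, ground_truth_mask) if m)
    return on_target / total


# Hypothetical example: only the gendered words carry class information.
tokens = ["She", "opened", "her", "book"]
ground_truth_mask = [1, 0, 1, 0]
attributions = [0.6, 0.1, 0.2, 0.1]  # made-up output of some XAI method

score = attribution_mass_on_ground_truth(attributions, ground_truth_mask)
print(f"{score:.2f}")
```

Averaging such a score over a dataset, separately for each XAI method and each fine-tuning regime, would give one way to compare explanation performance across models.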

Citations: 0
Convolutional neural networks and mixture of experts for intrusion detection in 5G networks and beyond.
IF 4.7 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-05 eCollection Date: 2025-01-01 DOI: 10.3389/frai.2025.1708953
Loukas Ilias, George Doukas, Vangelis Lamprou, Christos Ntanos, Dimitris Askounis

The advent of 6G/NextG networks offers numerous benefits, including extreme capacity, reliability, and efficiency. To mitigate emerging security threats, 6G/NextG networks incorporate advanced artificial intelligence algorithms. However, existing studies on intrusion detection predominantly rely on deep neural networks with static components that are not conditionally dependent on the input, thereby limiting their representational power and efficiency. To address these issues, we present the first study to integrate a Mixture of Experts (MoE) architecture for the identification of malicious traffic. Specifically, we use network traffic data and convert the 1D feature array into a 2D matrix. Next, we pass this matrix through a convolutional neural network (CNN) layer, followed by batch normalization and max pooling layers. Subsequently, a sparsely gated MoE layer is used. This layer consists of a set of expert networks (dense layers) and a router that assigns weights to each expert's output. Sparsity is achieved by selecting only the most relevant experts from the full set. Finally, we conduct a series of ablation experiments to demonstrate the effectiveness of our proposed model. Experiments are conducted on the 5G-NIDD dataset, a network intrusion detection dataset generated from a real 5G test network, and the NANCY dataset, which includes cyberattacks from the O-RAN 5G Testbed Dataset. The results show that our approach achieves accuracies of up to 99.96% and 79.59% on the 5G-NIDD and NANCY datasets, respectively. The findings also show that our proposed model offers multiple advantages over state-of-the-art approaches.
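
The two less familiar steps in this pipeline, reshaping a 1D feature array into a 2D matrix and routing through a sparsely gated MoE layer, can be sketched in plain Python as follows. This is a simplified illustration, not the authors' implementation. The CNN, batch normalization, and max pooling stages are omitted, and the zero-padding, the top-k selection with softmax over the selected router scores, and all names and weights are assumptions.

```python
import math


def reshape_1d_to_2d(features, rows, cols):
    """Zero-pad a 1D feature list and reshape it into a rows x cols matrix."""
    if len(features) > rows * cols:
        raise ValueError("too many features for the requested shape")
    padded = list(features) + [0.0] * (rows * cols - len(features))
    return [padded[r * cols:(r + 1) * cols] for r in range(rows)]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias):
    """Dense layer without activation: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sparse_moe(x, router_w, experts, k):
    """Route x to the k experts with the highest router scores.

    router_w: one weight row per expert (router logits = router_w @ x).
    experts: list of (weights, bias) pairs, each a dense layer.
    Returns the gate-weighted sum of the selected experts' outputs, the
    selected expert indices, and their gate weights (which sum to 1).
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_w]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    gates = softmax([logits[i] for i in top])
    out = [0.0] * len(experts[0][1])
    for gate, i in zip(gates, top):
        weights, bias = experts[i]
        y = dense(x, weights, bias)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, top, gates


# Hypothetical toy input: 2 features, 3 experts, only 2 experts evaluated.
x = [1.0, 0.0]
router_w = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
experts = [
    ([[1.0, 0.0]], [0.0]),
    ([[0.0, 1.0]], [0.0]),
    ([[1.0, 1.0]], [1.0]),
]
output, chosen, gates = sparse_moe(x, router_w, experts, k=2)
print(chosen, gates, output)
print(reshape_1d_to_2d([1.0, 2.0, 3.0], rows=2, cols=2))
```

Because only the k selected experts are evaluated for each input, the computation depends on the input, which is the conditional behavior the abstract contrasts with static network components.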

Citations: 0