Pub Date: 2026-01-12; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1685376
Batool Alabdullah, Suresh Sankaranarayanan
Introduction: The increasing cyber threats targeting industrial control systems (ICS) and the Internet of Things (IoT) pose significant risks, especially in critical infrastructures like the oil and gas sector. Existing machine learning (ML) approaches for cyberattack detection often rely on binary classification and lack computational efficiency.
Methods: This study proposes two optimized stacked ensemble models to enhance attack detection accuracy while reducing computational overhead. The main contribution lies in the strategic selection and integration of diverse base models, such as Logistic Regression, Extra Tree Classifier, XGBoost, and LGBM, with a Random Forest Classifier (RFC) as the final estimator. These models are chosen to address characteristics of security datasets such as class imbalance, noise, and complex attack patterns. The combination aims to leverage different decision boundaries and learning mechanisms.
Results: Evaluations show that the Stacked Ensemble_2 model achieves 97% accuracy with a combined training and testing time of 54 minutes. Stacked Ensemble_2, which outperformed the traditional Stacked Ensemble_1, was also evaluated on the CICIDS 2017 dataset, where it achieved 100% accuracy with an AUROC of 99%.
Discussion: The results indicate that the proposed Stacked Ensemble_2 model provides a scalable, real-time detection mechanism for securing ICS and IoT environments. By proving its effectiveness on unseen data, this model demonstrates a significant advancement over traditional methods, offering enhanced accuracy and efficiency in detecting sophisticated cyber threats in critical infrastructure sectors.
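The stacking idea described in the Methods can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the base learners here (two threshold stumps and a nearest-centroid rule), the lookup-table meta-learner, the synthetic two-class data, and every function name are assumptions chosen only to keep the example short. The paper's models (Logistic Regression, Extra Tree Classifier, XGBoost, LGBM, and a Random Forest final estimator) would normally be combined with a library facility such as scikit-learn's `StackingClassifier`. What the sketch does show is the general mechanism: base learners produce out-of-fold predictions, and a final estimator is trained on those predictions.

```python
import random
from collections import Counter, defaultdict


def fit_stump(X, y, feature):
    """Hypothetical base learner: threshold one feature midway between class means."""
    mean0 = sum(x[feature] for x, t in zip(X, y) if t == 0) / y.count(0)
    mean1 = sum(x[feature] for x, t in zip(X, y) if t == 1) / y.count(1)
    threshold = (mean0 + mean1) / 2
    high = 1 if mean1 > mean0 else 0  # label predicted above the threshold
    return lambda x: high if x[feature] > threshold else 1 - high


def fit_centroid(X, y):
    """Hypothetical base learner: predict the label of the nearest class centroid."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(x):
        return min(centroids,
                   key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])))
    return predict


BASE_FITTERS = [
    lambda X, y: fit_stump(X, y, 0),
    lambda X, y: fit_stump(X, y, 1),
    fit_centroid,
]


def fit_stacked(X, y, k=3):
    """Train base learners with k-fold out-of-fold predictions, then a meta-learner."""
    n = len(X)
    meta_rows = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        models = [fit([X[i] for i in train], [y[i] for i in train]) for fit in BASE_FITTERS]
        for i in range(fold, n, k):  # samples held out in this fold
            meta_rows[i] = tuple(m(X[i]) for m in models)
    # Meta-learner (assumption): majority label for each pattern of base predictions.
    votes = defaultdict(Counter)
    for row, label in zip(meta_rows, y):
        votes[row][label] += 1
    table = {row: counts.most_common(1)[0][0] for row, counts in votes.items()}
    final_models = [fit(X, y) for fit in BASE_FITTERS]  # refit on all training data

    def predict(x):
        row = tuple(m(x) for m in final_models)
        # Unseen pattern: fall back to a plain majority vote of the base learners.
        return table.get(row, Counter(row).most_common(1)[0][0])
    return predict


# Synthetic, well-separated two-class data (illustrative only).
rng = random.Random(0)
y_all = [i % 2 for i in range(120)]
X_all = [[rng.gauss(3 * t, 0.5), rng.gauss(3 * t, 0.5)] for t in y_all]
model = fit_stacked(X_all[:80], y_all[:80])
accuracy = sum(model(x) == t for x, t in zip(X_all[80:], y_all[80:])) / 40
print(f"held-out accuracy: {accuracy:.2f}")
```

Training the final estimator on out-of-fold predictions, rather than on predictions for data the base learners have already seen, is what keeps the meta-learner from simply trusting whichever base model overfits most.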
Title: Optimized ensemble machine learning model for cyberattack classification in industrial IoT. Frontiers in Artificial Intelligence, 8, 1685376. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12832753/pdf/
Pub Date: 2026-01-12; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1550604
Filippos Ventirozos, Mauricio Jacobo-Romero, Haifa Alrdahi, Sarah Clinch, Riza Batista-Navarro
Despite recent advancements, modern kitchens have, at best, one or more isolated (non-communicating) "smart" devices. The vision of a fully fledged ambient kitchen, where devices know what to do and when, has yet to be realized. As a step in that direction, we present RiCoRecA, a novel schema for parsing cooking recipes into a workflow representation suitable for automation. Methodologically, the schema requires a number of information extraction tasks, i.e., annotating named entities, identifying relations between them, coreference resolution, and entity tracking. RiCoRecA differs from previously reported approaches in that it learns these different information extraction tasks using one joint model. We also provide a dataset containing annotations that follow this schema. Furthermore, we compared two transformer-based models for parsing recipes into workflows, namely PEGASUS-X and LongT5. Our results demonstrate that PEGASUS-X surpassed LongT5 on all of the annotation tasks. Specifically, PEGASUS-X surpassed LongT5 by 39% in terms of F-Score when averaging performance over all tasks, and it demonstrated almost human-like performance.
Title: RiCoRecA: rich cooking recipe annotation schema. Frontiers in Artificial Intelligence, 8, 1550604. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12833278/pdf/
Pub Date: 2026-01-12; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1692894
Daudi Mashauri Migayo, Shubi Kaijage, Stephen Swetala, Devotha G Nyambo
Bone fractures are among the most common injuries in the modern world and affect all ages and races. Traditional treatment involves radiographic imaging that relies heavily on radiologists manually analyzing images. There have been efforts to develop computer-aided diagnosis tools that employ artificial intelligence and deep learning approaches. Existing literature focuses on tools that only detect and classify bone fractures rather than addressing the broader issue of bone fracture management, and scholarly works that include treatment recommendations are still lacking. Furthermore, deep learning-based object detectors that achieve state-of-the-art results are computationally expensive and are considered black-box solutions. Developing regions, such as Sub-Saharan Africa, face a shortage of radiologists and orthopedists. For this reason, this paper proposes a methodological approach that uses a more efficient object detection model to diagnose long bone fractures and provide prescription recommendations. An enhanced anchoring process, known as adaptive anchoring, is proposed to improve the performance of the Region Proposal Network and the object detection model. A Faster R-CNN model with ResNet-50/101 and ResNeXt-50/101 backbones was used to develop an object detection model that takes X-ray images as input. To understand and interpret the model's decisions, a Gradient-based Class Activation Mapping method was used to assess the model's learnability. The results indicate that the proposed adaptive anchoring approach can improve computational efficiency, reducing training time by up to 29% compared to the traditional approach. Model accuracy during training and validation ranged between 94% and 98%. Overall, adaptive anchoring performed best when applied with the ResNet-101 backbone, yielding an Average Precision of 92.73%, an F1 score of 96.01%, a precision of 96.80%, and a recall of 95.23%. The study provides valuable insights into the use of computationally efficient deep learning models for medical recommendation systems. Future studies should develop models to diagnose fractures using input images from various modalities and to provide prescription recommendations.
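The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below performs that arithmetic; the function name is a hypothetical helper, and the only inputs are the two figures quoted in the abstract for the ResNet-101 backbone.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract for the ResNet-101 backbone.
f1 = f1_from_precision_recall(0.9680, 0.9523)
print(f"F1 = {f1:.4f}")  # prints "F1 = 0.9601"
```

The result, about 96.01%, agrees with the F1 score stated in the abstract, so the three reported figures are mutually consistent.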
Title: Enhanced multi-class object detector for bone fracture diagnosis with prescription recommendation. Frontiers in Artificial Intelligence, 8, 1692894. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12833394/pdf/
Pub Date: 2026-01-12; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1679218
Sunkara Mounika, Reeja S R
Introduction: Detecting epileptic seizures remains a major challenge in clinical neurology due to the complex, heterogeneous, and non-stationary characteristics of electroencephalogram (EEG) signals. Although recent machine learning (ML) and deep learning (DL) approaches have improved detection performance, most methods still struggle with limited interpretability, inadequate spatial-temporal modeling, and suboptimal generalization. To address these limitations, this study proposes an enhanced hybrid parallel convolutional-GhostNet framework (HPG-ESD) for robust seizure detection using multimodal EEG and functional Magnetic Resonance Imaging (fMRI) data.
Methods: The experimental data consist of pediatric scalp EEG recordings from 24 subjects in the CHB-MIT dataset (22-channel 10-20 system, 256 Hz sampling, continuous multi-hour recordings) and resting-state 3T fMRI scans from 52 participants in the UNAM TLE dataset (26 epilepsy patients and 26 healthy controls). EEG data underwent Gauss-based median filtering, while fMRI images were denoised using an adaptive weight-based Wiener filter. Spatial, temporal, and spectral EEG features were extracted alongside an enhanced common spatial pattern (E-CSP) representation, whereas fMRI features were obtained using deep 3D CNN embeddings combined with a smoothened pyramid histogram of oriented gradients (S-PHOG) descriptor. These multimodal features were fused within a soft-voting hybrid parallel convolutional-GhostNet (S-HPCGN) model, which integrates an improved attention-based parallel convolutional network (IAPCNet) and GhostNet to capture complementary spatial-temporal patterns.
Results: The proposed HPG-ESD framework achieved an accuracy of 0.941, precision of 0.939, and sensitivity of 0.944, outperforming conventional unimodal and state-of-the-art methods.
Discussion: These results demonstrate the potential of multi-modal learning and lightweight attention-enhanced architectures for reliable and clinically relevant seizure detection.
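The abstract describes the two branches (IAPCNet and GhostNet) as being combined by soft voting. A minimal sketch of soft voting in general is given here; it assumes equal weights, two classes, and made-up probability values, none of which come from the paper, and `soft_vote` is a hypothetical helper name.

```python
def soft_vote(model_probs, weights=None):
    """Fuse per-class probabilities from several models by a weighted average.

    model_probs: list of probability vectors, one per model.
    weights: optional per-model weights; equal weights are assumed if omitted.
    Returns (index of the winning class, fused probability vector).
    """
    n_models = len(model_probs)
    weights = weights or [1.0 / n_models] * n_models
    n_classes = len(model_probs[0])
    fused = [sum(w * probs[c] for w, probs in zip(weights, model_probs))
             for c in range(n_classes)]
    label = max(range(n_classes), key=fused.__getitem__)
    return label, fused


# Illustrative values only: class 0 = "seizure", class 1 = "non-seizure".
branch_a = [0.40, 0.60]  # stand-in for one branch's output
branch_b = [0.75, 0.25]  # stand-in for the other branch's output
label, fused = soft_vote([branch_a, branch_b])
print(label, [round(p, 3) for p in fused])  # prints "0 [0.575, 0.425]"
```

In this example the two branches disagree, so a hard majority vote would tie. Averaging probabilities lets the more confident branch decide, which is the usual motivation for soft over hard voting.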
Title: Improved attention-based PCNN with GhostNet for epilepsy seizure detection using EEG and fMRI modalities: extractive pattern and histogram feature set. Frontiers in Artificial Intelligence, 8, 1679218. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12850516/pdf/
Pub Date: 2026-01-12; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1700493
Shimaa Azzam, El-Morsy Ahmed El-Morsy, Amira S A Said, Nermin Eissa, Doaa Mahmoud Khalil
Background: Healthcare professionals' awareness and handling of artificial intelligence applications in healthcare enhance patient outcomes and improve processes. This study aimed to evaluate the perception, attitude, knowledge, and practice of healthcare professionals regarding the application of artificial intelligence in Egyptian healthcare settings.
Method: A cross-sectional study in which 367 healthcare professionals responded to an electronic questionnaire.
Results: Of the 367 participants (234 female), the most common specialty was radiology and laboratory testing (36.2%). The mean age was 27.03 years; 51.8% of respondents showed positive perception, 68.7% had sub-optimal knowledge, 52.9% expressed negative attitudes, and 53.4% demonstrated a low level of practice with AI tools. Younger age was significantly associated with positive perception (adjusted odds ratio (AOR) = 0.905, p = 0.020) and higher AI practice (AOR = 0.907, p = 0.026). University hospital professionals had 61.4% lower odds of optimal knowledge than private hospital professionals (AOR = 0.386, p = 0.046). Men had higher odds of both positive attitudes (AOR = 1.844, p = 0.010) and a high practice level (AOR = 2.92, p < 0.001). Pre-bachelor's holders had lower odds of positive attitudes (AOR = 0.361, p = 0.036), as did physicians compared with nurses and others (AOR = 0.424, p = 0.005). Bachelor's holders showed lower odds of high AI practice (AOR = 0.388, p = 0.017).
Conclusion: Despite moderate perception, most professionals have deficits in knowledge, attitude, and practice. Younger professionals and men showed higher engagement, indicating a need for targeted AI training, especially for older and female professionals.
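The percentage statements in the Results follow directly from the adjusted odds ratios: an AOR below 1 corresponds to lower odds and an AOR above 1 to higher odds, with the percent change given by (AOR - 1) x 100. The sketch below reproduces that arithmetic for values quoted in the abstract; the helper name is hypothetical, and reading the age AOR as "per additional year of age" is an assumption, since the abstract does not state the unit.

```python
def percent_change_in_odds(aor):
    """Percent change in odds implied by an odds ratio: (AOR - 1) * 100."""
    return (aor - 1.0) * 100.0


# AOR values quoted in the abstract.
examples = {
    "university vs private hospital, optimal knowledge": 0.386,
    "men vs women, positive attitude": 1.844,
    "men vs women, high practice level": 2.92,
    "age, positive perception (assumed per year)": 0.905,
}
for description, aor in examples.items():
    print(f"{description}: {percent_change_in_odds(aor):+.1f}%")
```

For example, AOR = 0.386 gives -61.4%, matching the "61.4% lower odds" stated in the abstract. Note that these are changes in odds, not in probability.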
Title: Perception and awareness of healthcare professionals toward the applications of artificial intelligence in Egyptian healthcare settings. Frontiers in Artificial Intelligence, 8, 1700493. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12833234/pdf/
Pub Date: 2026-01-12; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1702056
Xiaoji Li, Zhiyuan Wang
Introduction: This study addresses the challenge of improving wavefront correction for Orbital Angular Momentum (OAM) in oceanic turbulence using a physics-constrained Generative Adversarial Network (GAN).
Methods: We integrated physical constraints into a deep learning framework to reconstruct degraded input images (SSIM = 0.62). The model was trained with varied loss settings, including a baseline model, spectral constraints (+Spec), and spatial constraints (+Ortho).
Results: The dual-constraint approach (+Ortho+Spec) reached a near-optimal SSIM of 0.98. Ablation studies revealed that while +Ortho alone raised modal purity to 95.7%, the dual-constraint model achieved 98.4% purity. Power spectral density analysis via KL divergence confirmed the superiority of the dual-constraint model (KL = 0.56) over the baseline (KL = 2.47).
Discussion: These results demonstrate that integrating both spatial and spectral constraints effectively optimizes reconstruction, purity, and spectral fidelity, offering a robust solution for OAM correction in underwater optical communication systems.
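The spectral comparison in the Results relies on the KL divergence between power spectral densities. A minimal sketch of that computation for discrete spectra is given here. It assumes each spectrum is first normalized to sum to 1 and that the divergence is taken from the reference spectrum to the reconstructed one; the abstract does not specify either choice, and the function names and toy spectra are hypothetical.

```python
import math


def normalize(spectrum):
    """Scale a non-negative spectrum so that its bins sum to 1."""
    total = sum(spectrum)
    return [value / total for value in spectrum]


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) in nats; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


# Toy spectra (illustrative values, not data from the paper).
reference = normalize([8.0, 4.0, 2.0, 1.0, 1.0])
close_reconstruction = normalize([7.5, 4.2, 2.1, 1.2, 1.0])
poor_reconstruction = normalize([2.0, 2.0, 4.0, 4.0, 4.0])

kl_close = kl_divergence(reference, close_reconstruction)
kl_poor = kl_divergence(reference, poor_reconstruction)
print(f"close: {kl_close:.4f}  poor: {kl_poor:.4f}")
```

A lower value means the reconstructed spectrum distributes power across frequencies more like the reference does, which is the sense in which KL = 0.56 is better than KL = 2.47.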
Title: Physics-constrained GAN boosts OAM correction in ocean turbulence. Frontiers in Artificial Intelligence, 8, 1702056. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12833431/pdf/
Pub Date: 2026-01-09; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1743016
Anusree Ambady, Thomas K V
Title: Persona pedagogica in crisis: are educators becoming data custodians in the age of AI? Frontiers in Artificial Intelligence, 8, 1743016. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12827719/pdf/
Pub Date: 2026-01-09; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1716706
Maria Vaida, Ziyuan Huang
Graph Neural Networks (GNNs) have transformed multimodal healthcare data integration by capturing complex, non-Euclidean relationships across diverse sources such as electronic health records, medical imaging, genomic profiles, and clinical notes. This review synthesizes GNN applications in healthcare, highlighting their impact on clinical decision-making through multimodal integration, advanced fusion strategies, and attention mechanisms. Key applications include drug interaction and discovery, cancer detection and prognosis, clinical status prediction, infectious disease modeling, genomics, and the diagnosis of mental health and neurological disorders. Various GNN architectures demonstrate consistent applications in modeling both intra- and intermodal relationships. GNN architectures, such as Graph Convolutional Networks and Graph Attention Networks, are integrated with Convolutional Neural Networks (CNNs), transformer-based models, temporal encoders, and optimization algorithms to facilitate robust multimodal integration. Early, intermediate, late, and hybrid fusion strategies, enhanced by attention mechanisms like multi-head attention, enable dynamic prioritization of critical relationships, improving accuracy and interpretability. However, challenges remain, including data heterogeneity, computational demands, and the need for greater interpretability. Addressing these challenges presents opportunities to advance GNN adoption in medicine through scalable, transparent GNN models.
Title: Multimodal graph neural networks in healthcare: a review of fusion strategies across biomedical domains. Frontiers in Artificial Intelligence, 8, 1716706. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12827511/pdf/
Introduction: Railway transportation is increasingly critical for modern urban and intercity mobility. However, as rail networks grow in scale and operational intensity, track defect detection has become a key concern. Traditional inspection methods (manual inspection and ultrasonic, eddy current, and magnetic flux leakage testing) are limited by insufficient accuracy, low efficiency, or poor adaptability to complex environmental conditions.
Methods: An enhanced defect detection framework based on an improved YOLOv8 algorithm is proposed, tailored for small targets and complex backgrounds. It integrates three core improvements: 1) an AVCStem module with variable convolution kernels that dynamically adapts to defects of different shapes and scales; 2) an ADSPPF module that uses multi-scale pooling and multi-branch attention mechanisms to preserve fine-grained features across scales; and 3) an MSF module that enhances multi-scale feature fusion through partial convolution and hierarchical feature alignment.
Results and discussion: Experiments on a real-world track defect dataset show that the proposed model achieves 90.2% detection precision, 90.2% mAP@0.5, and 73.2% mAP@0.5:0.95, while the model size is reduced to 5.2 MB with 2.45M parameters. Comparative and ablation studies confirm the complementary advantages of each module and the model's superior performance over existing lightweight detectors. The proposed model provides a robust, accurate, and efficient solution for real-time railway defect detection. It shows strong potential for deployment on edge AI devices and mobile inspection robots, addressing the limitations of traditional inspection methods.
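The mAP@0.5 and mAP@0.5:0.95 figures above depend on how a predicted box is matched to a ground-truth box. As a minimal sketch (not the authors' evaluation code), the snippet below shows the standard intersection-over-union (IoU) test behind those thresholds: at mAP@0.5 a detection counts as correct when IoU is at least 0.5, while mAP@0.5:0.95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. The box coordinates and function names are hypothetical and chosen only for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


def iou_thresholds():
    """The ten thresholds averaged by mAP@0.5:0.95 (0.50, 0.55, ..., 0.95)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


# Hypothetical boxes: the prediction is shifted 2 px from the ground truth.
pred = (0, 0, 10, 10)
truth = (2, 0, 12, 10)
overlap = iou(pred, truth)  # 80 / 120, about 0.667
matches = {t: overlap >= t for t in iou_thresholds()}
```

With these hypothetical boxes the prediction counts as correct at the 0.5 threshold but not at 0.75, which is why mAP@0.5:0.95 is always the stricter of the two figures.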
An improved YOLOv8n with multi-scale feature fusion for real time and high precision railway track defect detection.
Zhihong Zhang, Liling Zhang, Xin Lu, Tingting Ma, Feng Huang, Sheng Zhong
Pub Date: 2026-01-09; DOI: 10.3389/frai.2025.1711309
Frontiers in Artificial Intelligence 8, 1711309. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12827737/pdf/
Pub Date: 2026-01-09; eCollection Date: 2025-01-01; DOI: 10.3389/frai.2025.1717129
Petr Weinlich, Tereza Semeradova
This study examines content freebooting on social media and its exploitation for marketing counterfeit and "dupe" products. Using a four-week dataset of TikTok ads linked to 32 distinct e-commerce domains, we develop and evaluate a multimodal provenance pipeline that combines perceptual hashing, audio fingerprinting, vision embeddings, and natural-language clustering, and apply it to 54 ads, 180 landing pages, and over 3,000 extracted video frames. The primary contribution is methodological: multimodal late fusion substantially outperforms single-modality detectors in identifying copyright-infringing reuse of creator content under adversarial transformations. Empirically, we document systematic asset theft from legitimate fashion creators, with several videos and review images reappearing across more than 10 separate domains. Purchases from three advertised shops, alongside control items, reveal systematic misrepresentation of product quality and unreliable fulfillment, situating freebooted ads at the intersection of copyright infringement, trademark-like "dupe" positioning, deceptive advertising, and consumer fraud. Network analysis of ad handles and domains indicates a coordinated cluster of shell actors, with a median time-to-reupload of 18 hours. As a secondary contribution, the study uses this provenance pipeline to illuminate how freebooted cultural assets are rapidly converted into counterfeit-linked sales, and to surface gaps in platform integrity and consumer protection. By integrating computer vision, audio analysis, and NLP techniques with network and fulfillment audits, the paper offers both a methodological framework for analyzing freebooting pipelines and socio-technical insights for platform governance in digital commerce.
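To make the late-fusion idea concrete, the sketch below combines one similarity score per modality into a single reuse score. It is an illustration under stated assumptions, not the pipeline described in the abstract: the average-hash variant, the weights, the 0.7 decision threshold, the 2x2 thumbnails, and the audio and vision scores are all hypothetical, and the abstract does not say how the study actually fuses its detectors.

```python
def average_hash(gray):
    """Average hash: one bit per pixel, set when the pixel is above the mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hash_similarity(hash_a, hash_b):
    """1 minus the normalised Hamming distance between two equal-length hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    differing = sum(a != b for a, b in zip(hash_a, hash_b))
    return 1.0 - differing / len(hash_a)


def late_fusion(scores, weights):
    """Weighted mean of per-modality similarity scores, each in [0, 1]."""
    total_weight = sum(weights[m] for m in scores)
    return sum(scores[m] * weights[m] for m in scores) / total_weight


# Hypothetical 2x2 grayscale thumbnails: a source frame, a brightened
# re-upload of it, and an unrelated frame.
original = [[10, 200], [220, 30]]
brightened = [[30, 220], [240, 50]]
unrelated = [[200, 10], [30, 220]]

# Assumed weights and threshold; a real system would tune these on labelled data.
WEIGHTS = {"phash": 1.0, "audio": 1.0, "vision": 2.0}
REUSE_THRESHOLD = 0.7

scores = {
    "phash": hash_similarity(average_hash(original), average_hash(brightened)),
    "audio": 0.4,   # placeholder score from an audio fingerprint matcher
    "vision": 0.8,  # placeholder score from embedding cosine similarity
}
fused = late_fusion(scores, WEIGHTS)
is_reuse = fused >= REUSE_THRESHOLD
```

The design point is that each detector is scored independently and only the scores are merged, so a transformation that defeats one modality (here, replaced audio lowering the audio score) does not by itself prevent a match.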
Detecting freebooted content in social media ads: multimodal provenance and e-commerce implications.
Frontiers in Artificial Intelligence 8, 1717129. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12827671/pdf/