[This corrects the article DOI: 10.3389/frai.2025.1689182.].
Precision healthcare is increasingly oriented toward the development of therapeutic strategies that are as individualized as the patients receiving them. Central to this paradigm shift is artificial intelligence (AI)-enabled multi-modal data integration, which consolidates heterogeneous data streams, including genomic, transcriptomic, proteomic, imaging, environmental, and electronic health record (EHR) data, into a unified analytical framework. This integrative approach enhances early disease detection, facilitates the discovery of clinically actionable biomarkers, and accelerates rational drug development, with particularly significant implications for oncology, neurology, and cardiovascular medicine. Advanced machine learning (ML) and deep learning (DL) algorithms are capable of extracting complex, non-linear associations across data modalities, thereby improving diagnostic precision, enabling robust risk stratification, and informing patient-specific therapeutic interventions. Furthermore, AI-driven applications in digital health, such as wearable biosensors and real-time physiological monitoring, allow for continuous, dynamic refinement of treatment plans. This review examines the transformative potential of multi-modal AI in precision medicine, with emphasis on its role in multi-omics data integration, predictive modeling, and clinical decision support. In parallel, it critically evaluates prevailing challenges, including data interoperability, algorithmic bias, and ethical considerations surrounding patient privacy. The synergistic convergence of AI and multi-modal data represents not merely a technological innovation but a fundamental redefinition of individualized healthcare delivery.
Introduction: Patients with severe COVID-19 may require mechanical ventilation (MV) or extracorporeal membrane oxygenation (ECMO). Predicting who will require these interventions and for how long is challenging due to the diverse responses among patients and the dynamic nature of the disease. As such, there is a need for better prediction of the duration and outcomes of MV use in patients, to improve patient care and aid with MV and ECMO allocation. Here we develop and examine the performance of ML models to predict MV duration, ECMO use, and mortality for patients with COVID-19.
Methods: In this retrospective prognostic study, hierarchical machine-learning models were developed to predict MV duration and outcomes from demographic data and time-series data consisting of vital signs and laboratory results. We trained our models on 10,378 patients with positive severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) test results from Emory's COVID CRADLE Dataset who sought treatment at Emory University Hospital between February 28, 2020, and January 24, 2022. Analysis was conducted between January 10, 2022, and April 5, 2024. The main outcomes and measures were the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the F-score for MV duration, need for ECMO, and mortality prediction.
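To make the hierarchical setup concrete, below is a minimal sketch of one plausible two-stage scheme in which a first classifier predicts whether MV is needed at all and a second classifier assigns a duration bin. The synthetic data, summary-statistic feature extraction, and gradient-boosting classifiers are illustrative assumptions, not the study's actual pipeline.

```python
# Minimal sketch of a two-stage hierarchical scheme (illustrative only):
# stage 1 predicts whether MV is needed at all, stage 2 assigns a duration bin.
# Synthetic data and summary-statistic features stand in for the real pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n_patients, n_timesteps, n_signals = 500, 24, 6                  # hypothetical sizes

vitals = rng.normal(size=(n_patients, n_timesteps, n_signals))   # vitals/labs time series
demographics = rng.normal(size=(n_patients, 4))                  # demographic features
duration_bin = rng.integers(0, 8, size=n_patients)               # 0 = no MV, 1-7 = duration bins

# Reduce each time series to mean/min/max per signal, then append demographics.
features = np.concatenate(
    [vitals.mean(axis=1), vitals.min(axis=1), vitals.max(axis=1), demographics], axis=1)

# Stage 1: does the patient need MV at all?
needs_mv = (duration_bin > 0).astype(int)
stage1 = GradientBoostingClassifier().fit(features, needs_mv)

# Stage 2: for ventilated patients, which duration bin?
ventilated = duration_bin > 0
stage2 = GradientBoostingClassifier().fit(features[ventilated], duration_bin[ventilated])

# Inference: only patients flagged by stage 1 are routed to the duration model.
flagged = stage1.predict(features).astype(bool)
predictions = np.zeros(n_patients, dtype=int)
predictions[flagged] = stage2.predict(features[flagged])
```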
Results: Data from 10,378 patients with COVID-19 (median [IQR] age, 60 [48-72] years; 5,281 [50.89%] women) were included. Overall MV class distributions for 0 days, 1-4 days, 5-9 days, 10-14 days, 15-19 days, 20-24 days, 25-29 days, and ≥30 days of MV were 8,141 (78.44%), 812 (7.82%), 325 (3.13%), 241 (2.32%), 153 (1.47%), 97 (0.93%), 87 (0.84%), and 522 (5.03%), respectively. Overall ECMO use and mortality rates were 15 (0.14%) and 1,114 (10.73%), respectively. On the MV duration, ECMO use, and mortality outcomes, the highest-performing models reached weighted average AUROC scores of 0.873, 0.902, and 0.774, and weighted average AUPRC scores of 0.790, 0.999, and 0.893, respectively.
Conclusions and relevance: Hierarchical ML models trained on vital signs, laboratory results, and demographic data show promise for the prediction of MV duration, ECMO use, and mortality in COVID-19 patients.
Introduction: The launch of DeepSeek, a Chinese open-source generative AI model, generated substantial discussion regarding its capabilities and implications. The r/deepseek subreddit emerged as a key forum for real-time public evaluation. Analyzing this discourse is essential for understanding the sociotechnical perceptions shaping the integration of emerging AI systems.
Methods: We analyzed 46,649 posts and comments from r/deepseek (January-May 2025) using a computational framework combining VADER sentiment analysis, Hartmann emotion classification, BERTopic for thematic modeling, hyperlink extraction, and directed network analysis. Data preprocessing included cleaning, normalization, and lemmatization. We also examined correlations between sentiment/emotion scores and dominant topics.
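For orientation, a minimal sketch of how such a stack could be wired together is shown below. The package names, the Hugging Face model identifier for the Hartmann-style emotion classifier, and the toy documents are assumptions rather than the study's exact configuration.

```python
# Illustrative sketch of the analysis stack, not the study's exact pipeline.
# Package names and the Hugging Face model ID for the emotion classifier are
# assumptions; the toy documents merely show the data flow.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from transformers import pipeline
from bertopic import BERTopic

docs = [
    "DeepSeek runs locally on my laptop, impressive for an open-source model.",
    "How does it compare to ChatGPT on coding tasks?",
    "Worried about censorship and where my data ends up.",
]

# Sentiment: VADER compound score in [-1, 1] per document.
vader = SentimentIntensityAnalyzer()
sentiments = [vader.polarity_scores(d)["compound"] for d in docs]

# Emotion: an assumed, publicly available emotion model in the Hartmann line.
emotion_clf = pipeline("text-classification",
                       model="j-hartmann/emotion-english-distilroberta-base")
emotions = [emotion_clf(d)[0]["label"] for d in docs]

# Topics: BERTopic is only instantiated here; fitting meaningful topics
# requires the full corpus (tens of thousands of posts/comments).
topic_model = BERTopic()
# topics, probs = topic_model.fit_transform(full_corpus)

print(list(zip(docs, sentiments, emotions)))
```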
Results: Sentiment was predominantly positive (posts: 47.23%; comments: 44.26%), with neutral sentiment comprising ~30% of content. The most frequent emotion was neutrality, followed by surprise and fear, indicating ambivalent user reactions. Prominent topics included open-source AI models, DeepSeek usage, device compatibility, comparisons with ChatGPT, and censorship concerns. Hyperlink analysis indicated strong engagement with GitHub, Hugging Face, and DeepSeek's own services. Network analysis revealed a fragmented but active community, with Open-Source AI Models emerging as the most cohesive cluster.
Discussion: Community discourse framed DeepSeek as both a technical tool and a geopolitical issue. Enthusiasm centered on its performance, accessibility, and open-source nature, while concerns were voiced about censorship, data privacy, and potential ideological influence. The integrated analysis shows that collective perception emerged through decentralized, dialogic engagement, reflecting broader sociotechnical tensions related to openness, trust, and legitimacy in global AI development.
Introduction: To address the challenges of data heterogeneity, strategic diversity, and process opacity in interpreting multi-agent decision-making within complex competitive environments, we have developed TRACE, an end-to-end analytical framework for StarCraft II gameplay.
Methods: This framework standardizes raw replay data into aligned state trajectories, extracts "typical strategic progressions" using a Conditional Recurrent Variational Autoencoder (C-RVAE), and quantifies the deviation of individual games from these archetypes via counterfactual alignment. Its core innovation is the introduction of a dimensionless deviation metric, |Δ|, which achieves process-level interpretability. This metric reveals "which elements are important" by ranking time-averaged feature contributions across aggregated categories (Economy, Military, Technology) and shows "when deviations occur" through temporal heatmaps, forging a verifiable evidence chain.
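Since the exact definition of |Δ| is not reproduced here, the following is only a hedged illustration of a dimensionless, time-averaged deviation between an observed trajectory and an archetype, aggregated per category; the feature names, scaling, and aggregation are assumptions.

```python
# Hedged illustration of a dimensionless, time-averaged deviation between an
# observed game trajectory and an archetype, aggregated per category. Feature
# names, scaling, and aggregation are assumptions; TRACE's exact |delta| is not
# reproduced here.
import numpy as np

T = 120                                                   # aligned time steps
feature_names = ["minerals", "army_supply", "tech_buildings"]
categories = {"Economy": ["minerals"],
              "Military": ["army_supply"],
              "Technology": ["tech_buildings"]}

rng = np.random.default_rng(1)
archetype = rng.normal(size=(T, len(feature_names)))      # typical strategic progression
game = archetype + rng.normal(scale=0.3, size=archetype.shape)  # observed game

# Dimensionless: scale each feature by the archetype's standard deviation.
scale = archetype.std(axis=0) + 1e-8
abs_deviation = np.abs(game - archetype) / scale          # shape (T, n_features)

per_feature = abs_deviation.mean(axis=0)                  # time-averaged deviation
per_category = {cat: float(per_feature[[feature_names.index(f) for f in fs]].sum())
                for cat, fs in categories.items()}
print(per_category)                                       # which category drives the gap
```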
Results: Quantitative evaluation on professional tournament datasets demonstrates the framework's robustness, revealing that strategic deviations often crystallize in the early game (averaging 8.4% of match duration) and are frequently driven by critical technology timing gaps. The counterfactual generation module effectively restores strategic alignment, achieving an average similarity improvement of over 90% by correcting identified divergences. Furthermore, expert human evaluation confirms the practical utility of the system, awarding high scores for Factual Fidelity (4.6/5.0) and Causal Coherence (4.3/5.0) to the automatically generated narratives.
Discussion: By providing open-access code and reproducible datasets, TRACE lowers the barrier to large-scale replay analysis, offering an operational quantitative basis for macro-strategy understanding, coaching reviews, and AI model evaluation.
Background: The rapid evolution of interactive AI has reshaped human-computer interaction, with ChatGPT emerging as a key tool for chatbot development. Industries such as healthcare, customer service, and education increasingly integrate chatbots, highlighting the need for a structured development framework.
Purpose: This study proposes a framework for designing intelligent chatbots using ChatGPT, focusing on user experience, hybrid design models, prompt engineering, and system limitations. The framework aims to bridge the gap between technical innovation and real-world application.
Methods: A systematic literature review (SLR) was conducted, analyzing 40 relevant studies. The research was structured around three key questions: (1) How do user experience and engagement influence chatbot performance? (2) How do hybrid design models improve chatbot performance? (3) What are the limitations of using ChatGPT, and how does prompt engineering affect responses?
Results: The findings emphasize that well-designed user interactions enhance engagement and trust. Hybrid models integrating rule-based and machine learning techniques improve chatbot functionality. However, challenges such as response inconsistencies, ethical concerns, and prompt sensitivity require careful consideration. Building on these findings, this study proposes a framework for the design, development, and implementation of effective chatbots with ChatGPT.
Conclusion: This study provides a structured framework for chatbot development with ChatGPT, offering insights into optimizing user experience, leveraging hybrid design, and mitigating limitations. The proposed framework serves as a practical guide for researchers, developers, and businesses aiming to create intelligent, user-centric chatbot solutions.
Large pre-trained language models have become a crucial backbone for many downstream tasks in natural language processing (NLP), and because they are trained on a plethora of data containing a variety of biases, such as gender bias, it has been shown that they can inherit such biases in their weights, potentially affecting their prediction behavior. However, it is unclear to what extent these biases also affect feature attributions generated by applying "explainable artificial intelligence" (XAI) techniques, possibly in unfavorable ways. To systematically study this question, we create a gender-controlled text dataset, GECO, in which the alteration of grammatical gender forms induces class-specific words and provides ground truth feature attributions for gender classification tasks. This enables an objective evaluation of the correctness of XAI methods. We apply this dataset to the pre-trained BERT model, which we fine-tune to different degrees, to quantitatively measure how pre-training induces undesirable bias in feature attributions and to what extent fine-tuning can mitigate such explanation bias. To this end, we provide GECOBench, a rigorous quantitative evaluation framework for benchmarking popular XAI methods. We show a clear dependency between explanation performance and the number of fine-tuned layers, where XAI methods are observed to benefit particularly from fine-tuning or complete retraining of embedding layers.
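As an illustration of how attribution correctness can be scored once token-level ground truth is available, the sketch below ranks attribution scores against a binary ground-truth mask with AUROC; the tokens, scores, and metric choice are illustrative and do not reproduce GECOBench's exact evaluation.

```python
# Hedged sketch of scoring attribution correctness against token-level ground
# truth of the kind GECO provides: attribution scores are treated as a ranking
# and compared to a binary mask via AUROC. Tokens, scores, and the metric
# choice are illustrative and do not reproduce GECOBench's exact evaluation.
import numpy as np
from sklearn.metrics import roc_auc_score

tokens       = ["she", "quickly", "finished", "her", "thesis"]
ground_truth = np.array([1, 0, 0, 1, 0])                 # gender-bearing words drive the label
attribution  = np.array([0.70, 0.10, 0.05, 0.40, 0.20])  # scores from some XAI method

ranking = [tokens[i] for i in np.argsort(-attribution)]
print("tokens ranked by attribution:", ranking)
print("attribution AUROC vs. ground truth:", roc_auc_score(ground_truth, attribution))
```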
The advent of 6G/NextG networks offers numerous benefits, including extreme capacity, reliability, and efficiency. To mitigate emerging security threats, 6G/NextG networks incorporate advanced artificial intelligence algorithms. However, existing studies on intrusion detection predominantly rely on deep neural networks with static components that are not conditionally dependent on the input, thereby limiting their representational power and efficiency. To address these issues, we present the first study to integrate a Mixture of Experts (MoE) architecture for the identification of malicious traffic. Specifically, we use network traffic data and convert the 1D feature array into a 2D matrix. Next, we pass this matrix through a convolutional neural network (CNN) layer, followed by batch normalization and max pooling layers. Subsequently, a sparsely gated MoE layer is used. This layer consists of a set of expert networks (dense layers) and a router that assigns weights to each expert's output. Sparsity is achieved by selecting only the most relevant experts from the full set. Finally, we conduct a series of ablation experiments to demonstrate the effectiveness of our proposed model. Experiments are conducted on the 5G-NIDD dataset, a network intrusion detection dataset generated from a real 5G test network, and the NANCY dataset, which includes cyberattacks from the O-RAN 5G Testbed Dataset. The results show that our introduced approach achieves accuracies of up to 99.96% and 79.59% on the 5G-NIDD and NANCY datasets, respectively. The findings also show that our proposed model offers multiple advantages over state-of-the-art approaches.
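A hedged PyTorch sketch of the described pipeline (a 1D feature array reshaped into a 2D matrix, a CNN block with batch normalization and max pooling, then a sparsely gated MoE layer with top-k routing) is given below; layer sizes, the number of experts, and k are illustrative assumptions rather than the paper's exact values.

```python
# Hedged PyTorch sketch of the described pipeline: a 1D feature vector is
# reshaped into a 2D matrix, passed through a CNN block with batch norm and
# max pooling, and then through a sparsely gated MoE layer with top-k routing.
# Layer sizes, the number of experts, and k are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(n_experts)])
        self.router = nn.Linear(dim, n_experts)          # assigns a score to each expert
        self.k = k

    def forward(self, x):
        scores = self.router(x)                          # (batch, n_experts)
        top_scores, top_idx = torch.topk(scores, self.k, dim=-1)
        weights = F.softmax(top_scores, dim=-1)          # sparsity: only top-k experts fire
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[:, slot][mask].unsqueeze(-1) * expert(x[mask])
        return out

class MoEIntrusionDetector(nn.Module):
    def __init__(self, n_features=64, n_classes=2):
        super().__init__()
        self.side = int(n_features ** 0.5)               # 1D feature array -> 2D matrix
        self.conv = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1),
                                  nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2))
        flat = 16 * (self.side // 2) ** 2
        self.moe = SparseMoE(flat)
        self.head = nn.Linear(flat, n_classes)

    def forward(self, x):                                # x: (batch, n_features)
        x = x.view(-1, 1, self.side, self.side)
        x = self.conv(x).flatten(1)
        return self.head(self.moe(x))

model = MoEIntrusionDetector()
logits = model(torch.randn(4, 64))                       # 4 flows, 64 features each
print(logits.shape)                                      # torch.Size([4, 2])
```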
In this work we argue that, despite recent claims about successful modeling of the visual brain using deep nets, the problem is far from being solved, particularly for low-level vision. Open issues include: where should we read from in ANNs to check behavior? What should the read-out be? Is this ad-hoc read-out considered part of the brain model or not? In order to understand vision-ANNs, should we use artificial psychophysics or artificial physiology? In any case, should artificial tests literally match the experiments done with humans? These questions suggest a clear need for biologically sensible tests for deep models of the visual brain and, more generally, to understand ANNs devoted to generic vision tasks. Following our use of low-level facts from Vision Science in Image Processing, we present a low-level dataset compiling the basic spatio-chromatic properties that describe the adaptive bottleneck of the retina-V1 pathway and are not currently available in popular databases such as BrainScore. We propose its use for qualitative and quantitative model evaluation. As an illustration of the proposed methods, we check the behavior of three recent models with similar deep architectures: (1) A parametric model tuned via the psychophysical method of Maximum Differentiation [Malo & Simoncelli SPIE 15, Martinez et al. PLOS 18, Martinez et al. Front. Neurosci. 19], (2) A non-parametric model (the PerceptNet) tuned to maximize the correlation with humans on subjective image distortions [Hepburn et al. IEEE ICIP 20], and (3) A model with the same encoder as the PerceptNet, but tuned for image segmentation [Hernandez-Camara et al. Patt.Recogn.Lett. 23, Hernandez-Camara et al. Neurocomp. 25]. Results on the proposed 10 compelling psycho/physio visual properties show that the first (parametric) model is the one with behavior closest to humans.
Introduction: Artificial intelligence (AI), particularly deep learning (DL), offers automated solutions for early detection of plant diseases to improve crop yield. However, training accurate models on real-field data remains challenging due to overfitting and limited generalization. As observed in prior studies, traditional CNNs often struggle with real-environment variability, and transfer learning can lead to instability in training on domain-specific leaf datasets. This study focuses on detecting dome galls, a disease in Cordia dichotoma, by formulating a binary classification task (healthy vs. diseased leaves) using a custom dataset of 3,900 leaf images collected from real field environments.
Methods: Initially, both custom CNNs and transfer learning models were trained and compared. Among them, a modified ResNet-50 architecture showed promising results but suffered from overfitting and unstable convergence. To address this, the final sigmoid activation layer was replaced with a Support Vector Machine (SVM), and L2 regularization was applied to reduce overfitting. This hybrid DeepSVM architecture stabilized training and improved model robustness. Image preprocessing and augmentation techniques were applied to increase variability and prevent overfitting.
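A minimal sketch of a CNN-plus-SVM hybrid in the spirit of the described DeepSVM follows, assuming torchvision's ResNet-50 as the feature backbone and scikit-learn's L2-regularized LinearSVC as the classifier; the backbone choice, toy tensors, and hyperparameters are illustrative, not the study's configuration.

```python
# Minimal sketch of a CNN + SVM hybrid in the spirit of the described DeepSVM:
# a ResNet-50 backbone provides features and an L2-regularized linear SVM
# replaces the final sigmoid classifier. The backbone choice, toy tensors, and
# hyperparameters are illustrative assumptions, not the study's configuration.
import torch
import torch.nn as nn
from torchvision import models
from sklearn.svm import LinearSVC

# Backbone with the final fully connected layer removed (2048-d features).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()
backbone.eval()

def extract_features(images):                 # images: (batch, 3, 224, 224)
    with torch.no_grad():
        return backbone(images).numpy()

# Toy stand-ins for batches of healthy and diseased leaf images.
healthy = torch.randn(8, 3, 224, 224)
diseased = torch.randn(8, 3, 224, 224)

X = extract_features(torch.cat([healthy, diseased]))
y = [0] * 8 + [1] * 8                         # 0 = healthy, 1 = dome gall

# LinearSVC applies L2 regularization by default; C controls its strength.
svm = LinearSVC(C=1.0).fit(X, y)
print("training accuracy:", svm.score(X, y))
```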
Results: The final model was evaluated on a separate test set of 400 images, and the results remained stable across repeated runs. DeepSVM achieved an accuracy of 94.50% and an F1-score of 94.47%, outperforming other well-known models like VGG-16, InceptionResNetv2, and MobileNet-V2.
Conclusion: These results indicate that the proposed DeepSVM approach offers better generalization and training stability than conventional CNN classifiers, potentially aiding in automated disease monitoring for precision agriculture.

