Background: Consumer-level drug recalls usually require action by individual patients. The Food and Drug Administration (FDA) maintains public-facing outlets for drug safety information, including all recalls, but individual consumers may not be aware of them. Moreover, no system is in place to notify individual prescribers which of their patients are affected by a specific recall.
Objective: We aimed to leverage the FDA's Healthy Citizen prototype web-based software platform, which provides users with information about recalls, to automatically notify patients of relevant recalls.
Methods: We developed and evaluated an electronic notification system in the primary care and cardiology practices at a large, urban, academic medical center. The health care portal scanned the FDA Healthy Citizen application programming interface nightly to detect new recalls, identified patients who had those medications in their electronic health record (EHR) medication list, and sent them a message through the EHR patient portal with a link to a customized FDA information display. Using structured interviews, we assessed qualitative feedback on the system and portal messaging from a convenience sample of 9 patients.
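The nightly matching step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Recall` and `Patient` types, the helper names, and the idea of keeping a set of already-seen recall IDs are all assumptions made for the example, and no real FDA or EHR interface is called. Matching is by normalized drug name only, because the medication list is assumed to carry no lot numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recall:
    recall_id: str   # hypothetical identifier for a recall notice
    drug_name: str   # normalized (lowercase) drug name


@dataclass(frozen=True)
class Patient:
    patient_id: str
    medications: frozenset  # normalized drug names from the EHR medication list


def new_recalls(fetched, seen_ids):
    """Keep only recalls that were not seen in an earlier nightly run."""
    return [r for r in fetched if r.recall_id not in seen_ids]


def match_patients(recalls, patients):
    """Map recall_id -> sorted patient_ids whose medication list has the drug."""
    matches = {}
    for recall in recalls:
        hits = sorted(
            p.patient_id for p in patients if recall.drug_name in p.medications
        )
        if hits:
            matches[recall.recall_id] = hits
    return matches


def build_message(recall, info_url):
    """Compose the portal message text; info_url is supplied by the caller."""
    return f"A recall was issued for {recall.drug_name}. Details: {info_url}"
```

Because the match is on drug name alone, every patient with the medication on their list would be notified, whether or not the lot they received was affected.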
Results: The system was technically functional, but it was not possible to trace a medication prescription from the EHR to specific lot numbers dispensed to that patient by a community pharmacy. The qualitative feedback obtained from patients showed convergence of topics.
Conclusions: Lack of an accurate electronic audit trail from prescription to dispensed medication precludes clinical deployment of automated drug recall notification.
Background: Despite international efforts, maternal, newborn, and child health (MNCH) outcomes in Africa continue to lag due to inefficient health systems and underperforming financial frameworks. Financial factors, such as total health expenditure, health coverage indices, and spending per capita, are key but understudied drivers of MNCH service efficiency.
Objective: This study investigates the extent to which financial inputs influence the technical efficiency of MNCH service delivery across 46 African countries. The aim is to generate evidence for health financing policies that can enhance both efficiency and health equity.
Methods: We adopted a 2-stage analytical framework. First, data envelopment analysis using a variable returns-to-scale, input-oriented model was applied to measure technical efficiency. Second, Tobit regression identified the financial determinants of inefficiency. Explanatory variables included current health expenditures, a health coverage index, and current health expenditures per capita.
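To make the second stage concrete, the following is a minimal sketch of the log-likelihood that a Tobit regression maximizes. It assumes, for illustration only, that inefficiency is defined as 1 minus the efficiency score, so fully efficient countries sit at the censoring point 0; the abstract does not state the exact definition. The function names are hypothetical, and a real analysis would pass this function to a numerical optimizer.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inefficiency(scores):
    """Assumed definition: inefficiency = 1 - efficiency score."""
    return [1.0 - s for s in scores]


def tobit_loglik(y, X, beta, sigma, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    y: outcomes, X: rows of covariates, beta: coefficients, sigma: error SD.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for yi, xi in zip(y, X):
        mu = sum(b * x for b, x in zip(beta, xi))
        if yi <= lower:
            # Censored observation: probability that the latent value <= lower.
            total += math.log(normal_cdf((lower - mu) / sigma))
        else:
            # Uncensored observation: normal density of the residual.
            total += math.log(normal_pdf((yi - mu) / sigma) / sigma)
    return total
```

Treating the efficient countries as censored rather than as exact zeros is what distinguishes this from ordinary least squares on the same data.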
Results: Only 12 of 46 countries (26%) achieved full technical efficiency (efficiency score=1), while the rest (n=34, 74%) were inefficient, with a mean score of 0.849. Efficiency was notably lower in low-income countries (mean 0.810) than in upper-middle-income countries (mean 0.940). Tobit regression showed that increased current health expenditure significantly reduced inefficiency (β=-.0811; P=.001). Conversely, a higher health coverage index unexpectedly increased inefficiency (β=.0155; P=.001), suggesting that expanded coverage without improved governance or resource capacity may strain systems. Health expenditure per capita was not statistically significant. Model 2 demonstrated stronger explanatory power (pseudo R²=0.8943).
Conclusions: Financial factors, particularly total health expenditure, play a decisive role in shaping MNCH efficiency across African nations. However, expanding health coverage without parallel improvements in system governance may exacerbate inefficiencies. To enhance MNCH outcomes, policy efforts must focus on increasing and strategically allocating financial resources while strengthening institutional accountability and performance.
Background: Studies have shown that large language models (LLMs) are promising in therapeutic decision-making, with findings comparable to those of medical experts, but these studies used highly curated patient data.
Objective: This study aimed to determine if LLMs can make guideline-concordant treatment decisions based on patient data as typically present in clinical practice (lengthy, unstructured medical text).
Methods: We conducted a retrospective study of 80 patients with severe aortic stenosis who were scheduled for either surgical (SAVR; n=24) or transcatheter aortic valve replacement (TAVR; n=56) by our institutional heart team in 2022. Various LLMs (BioGPT, GPT-3.5, GPT-4, GPT-4 Turbo, GPT-4o, LLaMA-2, Mistral, PaLM 2, and DeepSeek-R1) were queried using either anonymized original medical reports or manually generated case summaries to determine the most guideline-concordant treatment. We measured agreement with the heart team using Cohen κ coefficients, reliability using intraclass correlation coefficients (ICCs), and fairness using the frequency bias index (FBI; FBI >1 indicated bias toward TAVR).
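The agreement and fairness measures named above can be illustrated with a short sketch. It assumes the FBI is the usual ratio of predicted to reference positives with TAVR as the positive class, which is consistent with FBI >1 indicating bias toward TAVR; the abstract does not spell out the formula, and the function names are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the two raters' marginal label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def frequency_bias_index(reference, predicted, positive="TAVR"):
    """Ratio of predicted positives to reference positives (assumed definition)."""
    ref_pos = sum(label == positive for label in reference)
    pred_pos = sum(label == positive for label in predicted)
    if ref_pos == 0:
        raise ValueError("reference contains no positive cases")
    return pred_pos / ref_pos


# Toy labels for illustration only; not data from the study.
heart_team = ["TAVR", "TAVR", "SAVR", "SAVR"]
model = ["TAVR", "TAVR", "TAVR", "SAVR"]
print(cohen_kappa(heart_team, model))           # 0.5
print(frequency_bias_index(heart_team, model))  # 1.5
```

In the toy example the model recommends TAVR for 3 patients while the heart team chose it for 2, so the FBI is 1.5; agreement is 0.75 observed against 0.5 expected by chance, giving κ=0.5.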
Results: When presented with original medical reports, LLMs showed poor performance (Cohen κ coefficient: -0.47 to 0.22; ICC: 0.0-1.0; FBI: 0.95-1.51). The LLMs' performance improved substantially when case summaries were used as input and additional guideline knowledge was added to the prompt (Cohen κ coefficient: -0.02 to 0.63; ICC: 0.01-1.0; FBI: 0.46-1.23). Qualitative analysis revealed instances of hallucinations in all LLMs tested.
Conclusions: Even advanced LLMs require extensively curated input for informed treatment decisions. Unreliable responses, bias, and hallucinations pose significant health risks and highlight the need for caution in applying LLMs to real-world clinical decision-making.
Background: The COVID-19 pandemic disrupted essential health care services globally, including routine childhood immunization programs. Ecuador faced significant challenges in maintaining vaccination coverage during this period.
Objective: This study aimed to analyze the impact of the COVID-19 pandemic on routine childhood vaccination coverage in Ecuador by comparing prepandemic (2019) and pandemic (2020-2021) data.
Methods: This retrospective observational study analyzed vaccination coverage data from the Ministry of Public Health of Ecuador and demographic data from the National Institute of Statistics and Censuses. We examined routine childhood vaccination coverage for children under 24 months across all 24 provinces. Statistical analyses were performed using SPSS (version 28.0), including descriptive statistics and comparative analysis. Coverage rates were calculated as percentages of children in target age groups receiving recommended doses.
Results: A significant decline in routine childhood vaccination coverage was observed during the pandemic. BCG vaccine coverage decreased from 86.4% in 2019 (n=286,569) to 80.7% in 2020 (n=266,961) and 75.3% in 2021 (n=248,812). Pentavalent vaccine third dose coverage dropped from 85.0% to 68.0% across the same period. The most dramatic decline was seen in measles-mumps-rubella vaccine second dose coverage, falling from 75.7% in 2019 to 58.4% in 2021. Coastal and highland provinces experienced the most severe reductions, with approximately 137,000 fewer doses administered in 2020 compared to stable prepandemic levels.
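The size of these declines is easier to compare when expressed both in percentage points and relative to the 2019 baseline. The sketch below uses only the coverage figures reported above; it assumes that "the same period" for the pentavalent vaccine means 2019 to 2021, and the dictionary layout and function name are illustrative.

```python
# Reported coverage (%) by vaccine and year, taken from the results above.
coverage = {
    "BCG": {2019: 86.4, 2020: 80.7, 2021: 75.3},
    "Pentavalent, third dose": {2019: 85.0, 2021: 68.0},  # assumes 2019 to 2021
    "MMR, second dose": {2019: 75.7, 2021: 58.4},
}


def decline(series, start=2019, end=2021):
    """Return (percentage-point drop, relative drop in %) between two years."""
    points = series[start] - series[end]
    relative = 100.0 * points / series[start]
    return round(points, 1), round(relative, 1)


for vaccine, series in coverage.items():
    points, relative = decline(series)
    print(f"{vaccine}: -{points} points ({relative}% relative decline)")
```

On these figures the drops are 11.1, 17.0, and 17.3 percentage points, corresponding to relative declines of about 12.8%, 20.0%, and 22.9%.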
Conclusions: The COVID-19 pandemic significantly impacted routine childhood vaccination coverage in Ecuador, with sustained declines through 2021. Regional disparities were evident, with vulnerable populations facing greater challenges accessing immunization services. Urgent interventions, including catch-up campaigns and strengthened health systems, are needed to restore coverage levels and prevent outbreaks of vaccine-preventable diseases.
Background: Artificial intelligence (AI) has evolved through various trends, with different subfields gaining prominence over time. Currently, conversational AI, particularly generative AI, is at the forefront. Conversational AI models are primarily focused on text-based tasks and are commonly deployed as chatbots. Recent advancements by OpenAI have enabled the integration of external, independently developed models, allowing chatbots to perform specialized, task-oriented functions beyond general language processing.
Objective: This study aims to develop a smart chatbot that integrates large language models from OpenAI with specialized domain-specific models, such as those used in medical image diagnostics. The system leverages transfer learning via Google's Teachable Machine to construct image-based classifiers and incorporates a diabetes detection model developed in TensorFlow.js. A key innovation is the chatbot's ability to extract relevant parameters from user input, trigger the appropriate diagnostic model, interpret the output, and deliver responses in natural language. The overarching goal is to demonstrate the potential of combining large language models with external models to build multimodal, task-oriented conversational agents.
Methods: Two image-based models were developed and integrated into the chatbot system. The first analyzes chest X-rays to detect viral and bacterial pneumonia. The second uses optical coherence tomography images to identify ocular conditions such as drusen, choroidal neovascularization, and diabetic macular edema. Both models were incorporated into the chatbot to enable image-based medical query handling. In addition, a text-based model was constructed to process physiological measurements for diabetes prediction using TensorFlow.js. The architecture is modular; new diagnostic models can be added without redesigning the chatbot, enabling straightforward functional expansion.
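One way to picture the modular design is a registry that maps a model name to a callable, so that adding a diagnostic model means registering one more entry. The sketch below is a hypothetical illustration in Python, not the authors' code: the class, the model names, and the stub models are assumptions, and in the described system the language model would choose the model name and extract the parameters from the user's message.

```python
from typing import Callable, Dict

# A diagnostic model takes extracted parameters and returns a label.
DiagnosticModel = Callable[[dict], str]


class ModelRegistry:
    """Holds diagnostic models by name so the chatbot core never changes."""

    def __init__(self) -> None:
        self._models: Dict[str, DiagnosticModel] = {}

    def register(self, name: str, model: DiagnosticModel) -> None:
        if name in self._models:
            raise ValueError(f"model already registered: {name}")
        self._models[name] = model

    def names(self) -> list:
        return sorted(self._models)

    def run(self, name: str, params: dict) -> str:
        if name not in self._models:
            raise KeyError(f"unknown model: {name}")
        return self._models[name](params)


def stub_xray_model(params: dict) -> str:
    """Placeholder standing in for a trained chest X-ray classifier."""
    return "no finding (stub)"


def stub_oct_model(params: dict) -> str:
    """Placeholder standing in for a trained OCT classifier."""
    return "no finding (stub)"


registry = ModelRegistry()
registry.register("chest_xray", stub_xray_model)
registry.register("oct", stub_oct_model)
```

A third model, such as the diabetes predictor, would be added with one more `register` call, leaving the dispatch logic untouched.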
Results: The findings demonstrate effective integration between the chatbot and the diagnostic models, with only minor deviations from expected behavior. Additionally, a stub function was implemented within the chatbot to schedule medical appointments based on the severity of a patient's condition, and it was specifically tested with the optical coherence tomography and X-ray models.
Conclusions: This study demonstrates the feasibility of developing advanced AI systems, including image-based diagnostic models and chatbot integration, by leveraging AI as a service. It also underscores the potential of AI to enhance user experiences in bioinformatics, paving the way for more intuitive and accessible interfaces in the field. Looking ahead, the modular nature of the chatbot allows for the integration of additional diagnostic models as the system evolves.
Background: In health care providers' performance assessment, standardized incidence ratios are essential tools used to assess whether observed event rates deviate from expected values. Accurate estimation of variance in these ratios is crucial as it affects decision-making regarding providers' performance. There are few data on how the choice of variance estimation method affects decision-making.
Objective: In this study, we compared 3 methods (the delta method, bootstrapping method, and Bayesian approach) to estimate the variance of the logarithm of the standardized incidence ratio.
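As a point of reference for the delta method, the textbook case treats the observed count O as Poisson and the expected count E as fixed, which gives Var(log SIR) ≈ 1/O. The sketch below implements only that simplified case; it is an assumption for illustration and not the random effects formulation used in the study, and the function name is hypothetical.

```python
import math


def log_sir_ci_delta(observed, expected, z=1.96):
    """SIR with a confidence interval built on the log scale.

    Assumes observed ~ Poisson and expected is fixed, so the delta method
    gives Var(log O) ~= Var(O) / O**2 = 1 / O.
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    sir = observed / expected
    se_log = math.sqrt(1.0 / observed)
    lower = math.exp(math.log(sir) - z * se_log)
    upper = math.exp(math.log(sir) + z * se_log)
    return sir, lower, upper
```

For example, with 25 observed and 20 expected events the SIR is 1.25 and the standard error of its logarithm is 0.2, so the interval is 1.25 multiplied and divided by exp(0.392).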
Methods: Using patient-level data from the Australia and New Zealand Dialysis and Transplant Registry for 2012-2023, we used a random effects model to predict treatment at home 1 year after starting treatment. We compared the 3 approaches (with more than 5000 iterations for bootstrapping and Markov chain Monte Carlo sampling) using bias, variance, and mean squared error (MSE) as performance measures. Using the 3 methods, funnel plots were used to compare the hospitals' performance in treating Indigenous and non-Indigenous patients close to home, as a service-level measure of equity.
Results: The bias values across all methods were similar, with the Bayesian method narrowly having the lowest bias (0.01922), followed by the delta method (0.01927) and bootstrap method (0.02567). In addition, the Bayesian method exhibited the lowest variance (0.00005), indicating more stable and less dispersed estimates. The delta method had a higher variance (0.00016), while the bootstrap method had the highest variance (0.00027), meaning it introduced more uncertainty. Finally, the Bayesian method had the lowest MSE (0.00042), indicating better overall accuracy, while the bootstrap method had the highest MSE (0.00094), showing it was the least reliable method.
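The three performance measures are linked by the standard identity MSE = bias² + variance, which offers a quick consistency check on the reported figures. The sketch below applies it to the rounded values given above; the dictionary layout is illustrative, and the delta method's MSE is not reported, so the value computed for it is only what the identity implies.

```python
# Reported bias, variance, and MSE; None marks a value not given in the results.
reported = {
    "Bayesian": {"bias": 0.01922, "variance": 0.00005, "mse": 0.00042},
    "delta": {"bias": 0.01927, "variance": 0.00016, "mse": None},
    "bootstrap": {"bias": 0.02567, "variance": 0.00027, "mse": 0.00094},
}


def mse_from(bias, variance):
    """Mean squared error implied by the bias-variance decomposition."""
    return bias ** 2 + variance


for method, stats in reported.items():
    implied = mse_from(stats["bias"], stats["variance"])
    shown = "not reported" if stats["mse"] is None else f"{stats['mse']:.5f}"
    print(f"{method}: implied MSE {implied:.5f}, reported {shown}")
```

The implied values agree with the reported MSEs to within about 0.00001, a gap that may reflect rounding of the inputs or a slightly different variance estimator. Nearly all of the MSE comes from the squared bias, so the ranking of the methods is driven mainly by bias rather than by variance.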
Conclusions: We demonstrated that these methods can be used to measure equity for patient-centered outcomes, both within and between service providers simultaneously. The choice of variance estimation method is critical and heavily affects the interpretation of the performance of health service providers. We favor the Bayesian Markov chain Monte Carlo method because it had the lowest bias, variance, and MSE of the 3 methods.
Background: Remote patient monitoring systems face critical challenges in real-time vital sign analysis and secure data transmission.
Objective: This study aimed to develop a novel architecture integrating deep learning with 5G networks for real-time vital sign monitoring and prediction.
Methods: A hybrid convolutional neural network-long short-term memory model with attention mechanisms was optimized for edge deployment using 5G ultrareliable low-latency communication. The system incorporated end-to-end encryption and HIPAA (Health Insurance Portability and Accountability Act) compliance. Performance was evaluated over 3 months using data from 1000 patients.
Results: The system demonstrated superior prediction accuracy and significantly reduced latency compared to existing solutions. Performance remained stable under adverse network conditions and across diverse patient populations, supporting thousands of concurrent monitoring sessions.
Conclusions: This framework addresses security, scalability, and robustness requirements for clinical implementation, potentially improving patient outcomes through early detection of deteriorating conditions.

