Background: Patients with acute coronary syndrome (ACS) who undergo percutaneous coronary intervention (PCI) remain at high risk for major adverse cardiovascular events (MACE). Conventional risk scores may not capture dynamic or nonlinear changes in postdischarge MACE risk, whereas machine learning (ML) approaches can improve predictive performance. However, few ML models have incorporated time-to-event analysis to reflect changes in MACE risk over time.
Objective: This study aimed to develop a time-to-event ML model for predicting MACE after PCI in patients with ACS and to identify the risk factors with time-varying contributions.
Methods: We analyzed electronic health records of 3159 patients with ACS who underwent PCI at a tertiary hospital in South Korea between 2008 and 2020. Six time-to-event ML models were developed using 54 variables. Model performance was evaluated using the time-dependent concordance index and Brier score. Variable importance was assessed using permutation importance and visualized with partial dependence plots to identify variables contributing to MACE risk over time.
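For illustration, the following is a minimal sketch of a time-to-event ML workflow of this kind, using scikit-survival with a random survival forest on synthetic data. The abstract does not specify which six learners were developed, so the model choice, the synthetic features, and the evaluation horizons here are assumptions rather than the study's actual configuration.

```python
import numpy as np
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_ipcw, brier_score
from sksurv.util import Surv

# Hypothetical feature matrix (n_patients x 54 variables) and follow-up data:
# event indicator (MACE yes/no) and time to event/censoring in days.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 54))
time = rng.exponential(scale=900, size=500)
event = rng.random(500) < 0.2  # roughly 20% event rate, as in the cohort

y = Surv.from_arrays(event=event, time=time)
X_train, X_test, y_train, y_test = X[:400], X[400:], y[:400], y[400:]

# One possible time-to-event learner (random survival forest).
model = RandomSurvivalForest(n_estimators=200, random_state=0).fit(X_train, y_train)

# Time-dependent concordance index at 30 days and 1 year.
risk = model.predict(X_test)
for tau in (30, 365):
    c_index = concordance_index_ipcw(y_train, y_test, risk, tau=tau)[0]
    print(f"C-index at {tau} days: {c_index:.3f}")

# Time-dependent Brier score at selected horizons.
times = np.array([30, 90, 180, 365])
surv_fns = model.predict_survival_function(X_test)
surv_probs = np.asarray([[fn(t) for t in times] for fn in surv_fns])
print(brier_score(y_train, y_test, surv_probs, times)[1])
```

Permutation importance and partial dependence plots, as described in the abstract, could then be computed on the fitted model with standard scikit-learn tooling; they are omitted here for brevity.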
Results: During a median follow-up of 3.8 years, 626 (19.8%) patients experienced MACE. The best-performing model achieved a time-dependent concordance index of 0.743 at day 30 and 0.616 at 1 year. Time-dependent Brier scores increased over time but remained comparable across all ML models. Key predictors included contrast volume, age, medication adherence, coronary artery disease severity, and glomerular filtration rate. Contrast volume ≥300 mL, age ≥60 years, and medication adherence score ≥30 were associated with early postdischarge risk, whereas coronary artery disease severity and glomerular filtration rate became more influential beyond 60 days.
Conclusions: The proposed time-to-event ML model effectively captured dynamic risk patterns after PCI and identified key predictors with time-varying effects. These findings may support individualized postdischarge management and early intervention strategies to prevent MACE in high-risk patients.
Background: Developing computable phenotypes (CP) based on electronic health records (EHR) data requires "gold-standard" labels for the outcome of interest. To generate these labels, clinicians typically chart-review a subset of patient charts. Charts to be reviewed are most often randomly sampled from the larger set of patients of interest. However, random sampling may fail to capture the diversity of the patient population, particularly if smaller subpopulations exist among those with the condition of interest. This can lead to poorly performing and biased CPs.
Objective: This study aimed to propose an unsupervised sampling approach designed to better capture a diverse patient cohort and improve the information coverage of chart review samples.
Methods: Our coverage sampling method starts by clustering the patient population of interest. We then perform stratified sampling from each cluster to ensure even representation in the chart review sample. We introduce a novel metric, nearest neighbor distance, to evaluate the coverage of the generated sample. To evaluate our method, we first conducted a simulation study to model and compare the performance of random sampling versus our proposed coverage sampling, varying the size and number of subpopulations within the larger cohort. Finally, we applied our approach to a real-world data set to develop a CP for hospitalization due to COVID-19, evaluating the different sampling strategies based on information coverage as well as the performance of the learned CP, using the area under the receiver operating characteristic curve.
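As a sketch of the coverage sampling idea, the following Python example clusters a synthetic cohort with k-means, samples evenly from each cluster, and compares coverage against random sampling using a mean nearest neighbor distance. The clustering algorithm, the number of clusters, and the exact distance definition are assumptions; the abstract does not specify these details.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

def coverage_sample(X, n_charts, n_clusters=10, random_state=0):
    """Cluster the cohort, then sample evenly from each cluster."""
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=random_state).fit_predict(X)
    rng = np.random.default_rng(random_state)
    per_cluster = max(1, n_charts // n_clusters)
    idx = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        take = min(per_cluster, members.size)
        idx.extend(rng.choice(members, size=take, replace=False))
    return np.array(idx[:n_charts])

def mean_nearest_neighbor_distance(X, sample_idx):
    """Coverage metric: average distance from each patient to the closest sampled chart."""
    nn = NearestNeighbors(n_neighbors=1).fit(X[sample_idx])
    dist, _ = nn.kneighbors(X)
    return dist.mean()

# Synthetic cohort with a small, distinct subpopulation (last 50 patients).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, size=(950, 20)),
               rng.normal(4, 1, size=(50, 20))])

cov_idx = coverage_sample(X, n_charts=100)
rand_idx = rng.choice(len(X), size=100, replace=False)
print("coverage sampling:", mean_nearest_neighbor_distance(X, cov_idx))
print("random sampling  :", mean_nearest_neighbor_distance(X, rand_idx))
```

A lower mean nearest neighbor distance indicates that the sampled charts sit closer to every patient in feature space, which is the sense in which the sample "covers" the cohort.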
Results: Our simulation studies show that the unsupervised coverage sampling approach provides broader coverage of patient populations compared to random sampling. When there are no underlying subpopulations, random and coverage sampling perform equally well for CP development. When there are subgroups, coverage sampling achieves area under the receiver operating characteristic curve gains of approximately 0.03-0.05 over random sampling. In the real-world application, the approach also outperformed random sampling, generating both a more representative sample and an area under the receiver operating characteristic curve improvement of 0.02 (95% CI -0.08 to 0.04).
Conclusions: The proposed coverage sampling method is an easy-to-implement approach that produces a chart review sample that is more representative of the source population. This allows one to learn a CP that has better performance both for subpopulations and the overall cohort. Studies that aim to develop CPs should consider alternative strategies other than randomly sampling patient charts.
Background: The Computerized Digit Vigilance Test (CDVT) is a well-established measure of sustained attention. However, the CDVT measures only total reaction time and response accuracy and fails to capture other crucial attentional features such as eye blink rate, yawns, head movements, and eye movements. Omitting such features may yield an incomplete picture of sustained attention.
Objective: This study aimed to develop an artificial intelligence (AI)-based Computerized Digit Vigilance Test (AI-CDVT) for older adults.
Methods: Participants were assessed by the CDVT with video recordings capturing their head and face. The Montreal Cognitive Assessment (MoCA), Stroop Color Word Test (SCW), and Color Trails Test (CTT) were also administered. The AI-CDVT was developed in three steps: (1) retrieving attentional features using OpenFace AI software (CMU MultiComp Lab), (2) establishing an AI-based scoring model with the Extreme Gradient Boosting regressor, and (3) assessing the AI-CDVT's validity by Pearson r values and test-retest reliability by intraclass correlation coefficients (ICCs).
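A minimal sketch of steps 2 and 3 is shown below, assuming placeholder OpenFace-derived attentional features and a placeholder external reference score. The feature set, hyperparameters, and validation split are illustrative only and are not the study's actual configuration; test-retest reliability (ICC) would additionally require repeated administrations and is not shown.

```python
import numpy as np
from scipy.stats import pearsonr
from xgboost import XGBRegressor

# Hypothetical OpenFace-derived features per participant
# (e.g., blink rate, head pose variability, gaze dispersion).
rng = np.random.default_rng(0)
n = 153
features = rng.normal(size=(n, 12))                            # placeholder feature matrix
target = features[:, :3].sum(axis=1) + rng.normal(0, 0.5, n)   # placeholder attention score

# Step 2: AI-based scoring model (Extreme Gradient Boosting regressor).
model = XGBRegressor(n_estimators=300, max_depth=3, learning_rate=0.05)
model.fit(features[:120], target[:120])
pred = model.predict(features[120:])

# Step 3 (validity): correlate predicted scores with an external measure,
# here a placeholder stand-in for the MoCA / SCW / CTT scores.
external_measure = target[120:] + rng.normal(0, 0.5, len(pred))
r, p = pearsonr(pred, external_measure)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```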
Results: In total, 153 participants were included. Pearson r values of the AI-CDVT were -0.42 with the MoCA, -0.31 with the SCW, and 0.46-0.61 with the CTT. The ICC of the AI-CDVT was 0.78.
Conclusions: We developed an AI-CDVT, which leveraged AI to extract attentional features from video recordings and integrated them to generate a comprehensive attention score. Our findings demonstrated good validity and test-retest reliability for the AI-CDVT, suggesting its potential as a reliable and valid tool for assessing sustained attention in older adults.
Background: Heart failure (HF) is a public health concern with a substantial impact on quality of life and cost of care. One of the major challenges in HF is the high rate of unplanned readmissions and the suboptimal performance of models for predicting them. Hence, in this study, we implemented embedding-based approaches to generate features for improving model performance.
Objective: The objective of this study was to evaluate and compare the effectiveness of different feature embedding approaches for improving the prediction of unplanned readmissions in patients with heart failure.
Methods: We compared three embedding approaches, word2vec on terminology codes, word2vec on concept unique identifiers (CUIs), and BERT on descriptive concept text, against a one-hot encoding baseline. We compared the area under the receiver operating characteristic curve (AUROC) and F1-scores of logistic regression, eXtreme Gradient Boosting (XGBoost), and artificial neural network (ANN) models using these embedding approaches. The models were tested on a heart failure cohort (N=21,031) identified from the MIMIC-IV dataset using the least restrictive phenotyping method.
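The following sketch illustrates one of the compared approaches: word2vec embeddings trained on terminology-code sequences feeding an XGBoost classifier. The toy code sequences and labels are invented, and the mean-pooling of code vectors into patient-level features is an assumption, as the abstract does not describe the aggregation step.

```python
import numpy as np
from gensim.models import Word2Vec
from xgboost import XGBClassifier

# Hypothetical patient histories as sequences of terminology codes.
patients = [
    ["I50.9", "N18.3", "E11.9"],
    ["I50.9", "I10", "J44.9", "N18.3"],
    ["I10", "E11.9"],
    # ... one code sequence per patient
]
readmitted = [1, 1, 0]  # placeholder readmission labels

# word2vec embeddings trained directly on the code sequences.
w2v = Word2Vec(sentences=patients, vector_size=100, window=5,
               min_count=1, sg=1, seed=0)

# Patient-level features: average of the code vectors in each history (assumed).
def embed(codes):
    return np.mean([w2v.wv[c] for c in codes], axis=0)

X = np.vstack([embed(p) for p in patients])
y = np.array(readmitted)

clf = XGBClassifier(n_estimators=100)
clf.fit(X, y)
# In practice, AUROC and F1 would be computed on a held-out test split, e.g.
# sklearn.metrics.roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]).
```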
Results: We found that the embedding approaches significantly improved the performance of the prediction models. XGBoost performed best across all approaches. The word2vec embeddings trained on the dataset (AUROC 0.65) outperformed the embeddings from the pretrained BERT model applied to descriptive text (AUROC 0.59).
Conclusions: Embedding methods, particularly word2vec trained on electronic health record data, discriminate HF readmission cases better than both one-hot encoding and pretrained BERT embeddings of concept descriptions, making them a viable approach for automating feature selection. The observed AUROC improvement (0.65 vs 0.54) may support more effective risk stratification and targeted clinical interventions.
Background: Electronic health records (EHRs) contain comprehensive information regarding diagnoses, clinical procedures, and prescribed medications. This makes them a valuable resource for developing automated hypertension medication recommendation systems. Within this field, existing research has used machine learning approaches, leveraging demographic characteristics and basic clinical indicators, or deep learning techniques, which extract patterns from EHR data, to predict optimal medications or improve the accuracy of recommendations for common antihypertensive medication categories. However, these methodologies have significant limitations. They seldom adequately characterize the synergistic relationships among heterogeneous medical entities, such as the interplay between comorbid conditions, laboratory results, and specific antihypertensive agents. Furthermore, given the chronic and fluctuating nature of hypertension, effective medication recommendations require dynamic adaptation to disease progression over time. Current approaches either lack rigorous temporal modeling of EHR data or fail to effectively integrate temporal dynamics with interentity relationships, resulting in recommendations that are not clinically appropriate.
Objective: This study aims to overcome the challenges in existing methods and introduce a novel model for hypertension medication recommendation that leverages the synergy and selectivity of heterogeneous medical entities.
Methods: First, we used patient EHR data to construct both heterogeneous and homogeneous graphs. The interentity synergies were captured using a multihead graph attention mechanism to enhance entity-level representations. Next, a bidirectional temporal selection mechanism calculated selective coefficients between current and historical visit records and aggregated them to form refined visit-level representations. Finally, medication recommendation probabilities were determined based on these comprehensive patient representations.
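As a simplified illustration of the two core components, multihead graph attention over medical entities and attention-weighted selection across visits, the following PyTorch Geometric sketch uses random placeholder data. It is not the authors' architecture, which additionally models both heterogeneous and homogeneous graphs and a bidirectional temporal selection mechanism; the dimensions, the single-graph simplification, and the scoring rule are assumptions.

```python
import torch
from torch_geometric.nn import GATConv

# Hypothetical EHR graph collapsed into a single node set:
# nodes = diagnoses, lab results, and medications; edges = co-occurrence links.
num_nodes, in_dim, hidden_dim, heads = 200, 64, 32, 4
x = torch.randn(num_nodes, in_dim)                  # initial entity embeddings
edge_index = torch.randint(0, num_nodes, (2, 800))  # placeholder edge list

# Multihead graph attention to capture interentity synergies.
gat = GATConv(in_dim, hidden_dim, heads=heads, concat=True)
entity_repr = gat(x, edge_index)                    # (num_nodes, hidden_dim * heads)

# Temporal selection (simplified, one direction only): score each historical
# visit against the current visit and aggregate with softmax coefficients.
visit_dim = hidden_dim * heads
visits = torch.randn(5, visit_dim)                  # visit-level summaries, oldest to current
current = visits[-1]
coeff = torch.softmax(visits @ current / visit_dim**0.5, dim=0)
patient_repr = (coeff.unsqueeze(1) * visits).sum(dim=0)

# Medication recommendation probabilities over a hypothetical drug vocabulary.
num_drugs = 150
head = torch.nn.Linear(visit_dim, num_drugs)
probs = torch.sigmoid(head(patient_repr))           # multilabel recommendation scores
```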
Results: Experimental evaluations on the real-world datasets Medical Information Mart for Intensive Care (MIMIC)-III v1.4 and MIMIC-IV v2.2 demonstrated that the proposed model achieved Jaccard similarity coefficients of 58.01% and 55.82%, respectively; areas under the curve of precision-recall of 83.56% and 80.69%, respectively; and F1-scores of 68.95% and 64.83%, respectively, outperforming the baseline models.
Conclusions: The findings indicate the superior efficacy of the introduced model in medication recommendation, highlighting its potential to enhance clinical decision-making in the management of hypertension. The code for the model has been released on GitHub.
Background: Large language models (LLMs) have shown promise in reducing clinical documentation burden, yet their real-world implementation remains rare. Especially in South Korea, hospitals face several unique challenges, such as strict data sovereignty requirements and operating in environments where English is not the primary language for documentation. Therefore, we initiated the Your-Knowledgeable Navigator of Treatment (Y-KNOT) project, aimed at developing an on-premises bilingual LLM-based artificial intelligence (AI) agent system integrated with electronic health records (EHRs) for automated clinical drafting.
Objective: We present the Y-KNOT project and provide insights into implementing AI-assisted clinical drafting tools within the constraints of the health care system.
Methods: This project involved multiple stakeholders and encompassed three simultaneous processes: LLM development, clinical co-development, and EHR integration. We developed a foundation LLM by pretraining Llama3-8B with Korean and English medical corpora. During the clinical co-development phase, the LLM was instruction-tuned for specific documentation tasks through iterative cycles that aligned physicians' clinical requirements, hospital data availability, documentation standards, and technical feasibility. The EHR integration phase focused on seamless AI agent incorporation into clinical workflows, involving document standardization, definition of trigger points, and user interaction optimization.
Results: The resulting system processes emergency department discharge summaries and preanesthetic assessments while maintaining existing clinical workflows. The drafting process is automatically triggered by specific events, such as scheduled batch jobs, with medical records automatically fed into the LLM as input. The agent is built on premises, with the entire architecture located inside the hospital.
Conclusions: The Y-KNOT project demonstrates the first seamless integration of an AI agent into an EHR system for clinical drafting. In collaboration with various clinical and administrative teams, we could promptly implement an LLM while addressing key challenges of data security, bilingual requirements, and workflow integration. Our experience highlights a practical and scalable approach to utilizing LLM-based AI agents for other health care institutions, paving the way for broader adoption of LLM-based solutions.
Background: With the rapid development of artificial intelligence, large language models (LLMs) have shown strong capabilities in natural language understanding, reasoning, and generation, attracting much research interest in applying LLMs to health and medicine. Critical care medicine (CCM) provides diagnosis and treatment for patients with critical illness who often require intensive monitoring and interventions in intensive care units (ICUs). Whether LLMs can be applied to CCM, and whether they can operate as ICU experts in assisting clinical decision-making rather than "stochastic parrots," remains uncertain.
Objective: This scoping review aims to provide a panoramic portrait of the application of LLMs in CCM, identifying the advantages, challenges, and future potential of LLMs in this field.
Methods: This study was conducted in accordance with the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. Literature was searched across 7 databases, including PubMed, Embase, Scopus, Web of Science, CINAHL, IEEE Xplore, and ACM Digital Library, from the first available paper to August 22, 2025.
Results: From an initial 2342 retrieved papers, 41 were selected for final review. LLMs played an important role in CCM through the following 3 main channels: clinical decision support, medical documentation and reporting, and medical education and doctor-patient communication. Compared to traditional artificial intelligence models, LLMs have advantages in handling unstructured data and do not require manual feature engineering. Meanwhile, applying LLMs to CCM has faced challenges, including hallucinations and poor interpretability, sensitivity to prompts, bias and alignment challenges, and privacy and ethical issues.
Conclusions: Although LLMs are not yet ICU experts, they have the potential to become valuable tools in CCM, helping to improve patient outcomes and optimize health care delivery. Future research should enhance model reliability and interpretability, improve model training and deployment scalability, integrate up-to-date medical knowledge, and strengthen privacy and ethical guidelines, paving the way for LLMs to fully realize their impact in critical care.
Trial registration: OSF Registries yn328; https://osf.io/yn328/.
Background: Information security within telemedicine systems is essential to advancing the digital transformation of health care. Telemedicine encompasses diverse modalities, including teleconsultation, telehealth, and remote patient monitoring, all of which depend on digital platforms, secured communication networks, and internet-connected devices. Although these systems have progressed in aligning with information security standards and regulations, there remains a shortage of comprehensive, practice-oriented studies evaluating which aspects of security are effectively addressed and which remain insufficiently managed, particularly within the Chilean context.
Objective: This study aims to examine how effectively telemedicine systems in Chile address the core security attributes of confidentiality, availability, and integrity.
Methods: Data were analyzed from an evaluation tool designed to assess the quality of telemedicine systems in Chile. Over a 6-year period, 25 telemedicine systems from different providers were assessed, and an in-depth examination of how companies manage key information security subcharacteristics within their systems was undertaken.
Results: The findings indicate that 52% (n=13) of telemedicine systems optimally implement cryptographic techniques to protect confidentiality. In contrast, 44% (n=11) lack robust strategies for adapting to, recovering from, and mitigating security-related incidents. Fault tolerance mechanisms are frequently integrated to minimize service disruption caused by system failures. However, the prioritization of data integrity varies: while some companies treat it as a critical requirement, others assign it limited importance.
Conclusions: This study offers an understanding of the security priorities and practices adopted by telemedicine providers. It highlights a prevailing tendency to prioritize security measures over usability, underscoring the need for a balanced approach that safeguards patient information while supporting efficient clinical workflows.
Background: In Switzerland, sexual assault reports have historically been documented on paper, which limited standardization and completeness and made it challenging to produce reliable statistics.
Objective: This study describes the development and implementation of an Electronic Sexual Assault Record (eSAR) within Geneva University Hospitals' Electronic Medical Record (EMR) system, with the aim of improving data quality, documentation, and multidisciplinary coordination.
Methods: The eSAR was developed by a multidisciplinary team including forensic doctors, gynecologists, nurses (clinical and informatics), epidemiologists, and IT specialists. Its structure was based on existing hospital protocols and international recommendations. Variables were defined as "essential" or "highly recommended," with structured fields to ensure completeness and comparability. Confidentiality was safeguarded through restricted access and regular audits.
Results: The eSAR was launched in June 2022 and revised in 2023 after user feedback and training. Since implementation, 382 reports have been completed. Data quality improved substantially, with major reductions in missing information. The system also streamlined workflows and strengthened collaboration across specialties.
Conclusions: The eSAR improved documentation and data reliability, providing a replicable model for standardized sexual assault reporting in Switzerland.

