Augmenting large language models to predict social determinants of mental health in opioid use disorder using patient clinical notes.
Pub Date: 2025-11-27 | eCollection Date: 2025-12-01 | DOI: 10.1093/jamiaopen/ooaf142
Madhavi Pagare, Deva Sai Kumar Bheesetti, Inyene Essien-Aleksi, Mohammad Arif Ul Alam
Objective: Identifying social determinants of mental health (SDOMH) in patients with opioid use disorder (OUD) is crucial for estimating risk and enabling early intervention. Extracting such data from unstructured clinical notes is challenging due to annotation complexity and requires advanced natural language processing (NLP) techniques. We propose the Human-in-the-Loop Large Language Model Interaction for Annotation (HLLIA) framework, combined with a Multilevel Hierarchical Clinical-Longformer Embedding (MHCLE) algorithm, to annotate and predict SDOMH variables.
Materials and methods: We utilized 2636 annotated discharge summaries from the Medical Information Mart for Intensive Care (MIMIC-IV) dataset. High-quality annotations were ensured via a human-in-the-loop approach, refined using large language models (LLMs). The MHCLE algorithm performed multi-label classification of 13 SDOMH variables and was evaluated against baseline models, including RoBERTa, Bio_ClinicalBERT, ClinicalBERT, and ClinicalBigBird.
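The abstract does not detail MHCLE's multilevel hierarchy, but its core building block, multi-label classification over Clinical-Longformer embeddings, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the checkpoint id, pooling strategy, and training details are all assumptions.

```python
# Minimal sketch: multi-label classification of 13 SDOMH variables over
# Longformer embeddings. The hierarchical pooling of MHCLE itself is not
# described in the abstract; this shows only the generic building block.
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

CHECKPOINT = "yikuan8/Clinical-Longformer"  # assumed checkpoint id
NUM_LABELS = 13                             # SDOMH variables

class MultiLabelLongformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(CHECKPOINT)
        self.classifier = nn.Linear(self.encoder.config.hidden_size, NUM_LABELS)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # first-token pooling (assumed)
        return self.classifier(pooled)        # one raw logit per SDOMH label

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = MultiLabelLongformer()
batch = tokenizer(["Patient reports housing instability ..."],
                  truncation=True, max_length=4096,
                  padding=True, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
loss_fn = nn.BCEWithLogitsLoss()  # independent sigmoid per label (multi-label)
```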
Results: The MHCLE model achieved superior performance with 96.29% accuracy and a 95.41% F1 score, surpassing baseline models. Training-testing policies P1, P2, and P3 yielded accuracies of 98.49%, 90.10%, and 89.04%, respectively, highlighting the importance of human intervention in refining LLM annotations.
Discussion and conclusion: Integrating the MHCLE model with the HLLIA framework offers an effective approach for predicting SDOMH factors from clinical notes, advancing NLP in OUD care. It highlights the importance of human oversight and sets a benchmark for future research.
{"title":"Augmenting large language models to predict social determinants of mental health in opioid use disorder using patient clinical notes.","authors":"Madhavi Pagare, Deva Sai Kumar Bheesetti, Inyene Essien-Aleksi, Mohammad Arif Ul Alam","doi":"10.1093/jamiaopen/ooaf142","DOIUrl":"10.1093/jamiaopen/ooaf142","url":null,"abstract":"<p><strong>Objective: </strong>Identifying social determinants of mental health (SDOMH) in patients with opioid use disorder (OUD) is crucial for estimating risk and enabling early intervention. Extracting such data from unstructured clinical notes is challenging due to annotation complexity and requires advanced natural language processing (NLP) techniques. We propose the Human-in-the-Loop Large Language Model Interaction for Annotation (HLLIA) framework, combined with a Multilevel Hierarchical Clinical-Longformer Embedding (MHCLE) algorithm, to annotate and predict SDOMH variables.</p><p><strong>Materials and methods: </strong>We utilized 2636 annotated discharge summaries from the Medical Information Mart for Intensive Care (MIMIC-IV) dataset. High-quality annotations were ensured via a human-in-the-loop approach, refined using large language models (LLMs). The MHCLE algorithm performed multi-label classification of 13 SDOMH variables and was evaluated against baseline models, including RoBERTa, Bio_ClinicalBERT, ClinicalBERT, and ClinicalBigBird.</p><p><strong>Results: </strong>The MHCLE model achieved superior performance with 96.29% accuracy and a 95.41% F1score, surpassing baseline models. Training-testing policies P1, P2, and P3 yielded accuracies of 98.49%, 90.10%, and 89.04%, respectively, highlighting the importance of human intervention in refining LLM annotations.</p><p><strong>Discussion and conclusion: </strong>Integrating the MHCLE model with the HLLIA framework offers an effective approach for predicting SDOMH factors from clinical notes, advancing NLP in OUD care. It highlights the importance of human oversight and sets a benchmark for future research.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 6","pages":"ooaf142"},"PeriodicalIF":3.4,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12664681/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145649331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Span-based annotation framework for LLM-based clinical named entity recognition: development and validation using Korean emergency department notes.
Pub Date: 2025-11-26 | eCollection Date: 2025-12-01 | DOI: 10.1093/jamiaopen/ooaf157
Eun Hye Jang, Javier Aguirre, Sangji Lee, Hyeyoon Moon, Won Chul Cha
Objective: This study aims to develop and validate a span-based annotation framework for clinical named entity recognition (NER) using large language models (LLMs), based on Korean emergency department clinical notes.
Materials and methods: Two datasets with the same entity types but different annotation spans (word- vs phrase-level) were constructed, with the phrase-level dataset further expanded into a doubled version. A Korean language-specific LLM was fine-tuned on each dataset, producing three variants that were compared with two baselines: a few-shot LLM and a fine-tuned small language model (SLM). The final variant, fine-tuned on the doubled phrase-level dataset, was further evaluated against a human annotator.
Results: In all experimental settings, the three variants outperformed the baselines, achieving the highest F1 scores across all metrics. The final variant achieved F1 scores exceeding 0.80 across all averaging strategies and evaluation metrics, including token-based, span-based exact, and span-based partial evaluations, demonstrating robustness suitable for practical settings.
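Token-based, span-based exact, and span-based partial evaluation differ only in their matching rule. A minimal sketch of the two span-based variants, assuming character-offset spans with a type label (scoring details beyond the abstract are assumptions):

```python
# Exact match requires identical boundaries and entity type;
# partial match credits any same-type overlap.
def f1(tp, n_pred, n_gold):
    p = tp / n_pred if n_pred else 0.0
    r = tp / n_gold if n_gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def overlaps(a, b):
    return a[0] < b[1] and b[0] < a[1]

def score(pred, gold):
    exact = sum(1 for s in pred if s in gold)
    partial = sum(1 for s in pred
                  if any(s[2] == g[2] and overlaps(s, g) for g in gold))
    return f1(exact, len(pred), len(gold)), f1(partial, len(pred), len(gold))

gold = [(0, 3, "SYMPTOM"), (10, 14, "MED")]  # (start, end, type) char spans
pred = [(0, 3, "SYMPTOM"), (9, 14, "MED")]
print(score(pred, gold))  # exact F1 = 0.5, partial F1 = 1.0
```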
Discussion: While few-shot prompt engineering is widely adopted for LLM-based clinical NER, our results showed that supervised fine-tuning (SFT) is consistently superior. The final variant outperformed the human annotator, emphasizing its potential as an automatic labeling tool.
Conclusion: This study introduced a novel span-based annotation framework for LLM-based clinical NER, verified by three independent experiments. In multilingual and real-world clinical settings, LLMs have proven capable of handling complex entity spans that include word-level and phrase-level annotations, particularly for long and attribute-rich entities.
{"title":"Span-based annotation framework for LLM-based clinical named entity recognition: development and validation using Korean emergency department notes.","authors":"Eun Hye Jang, Javier Aguirre, Sangji Lee, Hyeyoon Moon, Won Chul Cha","doi":"10.1093/jamiaopen/ooaf157","DOIUrl":"10.1093/jamiaopen/ooaf157","url":null,"abstract":"<p><strong>Objective: </strong>This study aims to develop and validate of a span-based annotation framework for clinical named entity recognition (NER) using large language models (LLMs) based on Korean emergency department clinical notes.</p><p><strong>Materials and methods: </strong>Two datasets with the same entity types but different annotation spans (word- vs phrase-level) were constructed, with the phrase-level dataset further was expanded into a doubled version. A Korean language-specific LLM was fine-tuned on each dataset, producing three variants that were compared with two baseline models, few-shot LLM and fine-tuned small language model (SLM). The final variant fine-tuned on the doubled phrase-level dataset was further evaluated against a human annotator.</p><p><strong>Results: </strong>In all experimental settings, three variants outperformed the baselines by achieving the highest F1 scores across all metrics. The final variant achieved F1 scores exceeding 0.80 across all averaging strategies and evaluation metrics, including token-based, span-based exact, and span-based partial evaluations demonstrating its robustness applicable in a practical setting.</p><p><strong>Discussion: </strong>While prompt engineering with few-shot is widely adopted for LLM-based clinical NER, our results proved that supervised fine-tuning (SFT) is consistently superior. The final variant outperformed the human annotator, emphasizing its potential as an automatic labeling tool.</p><p><strong>Conclusion: </strong>This study introduced a novel span-based annotation framework for LLM-based clinical NER verified by three independent experiments. In multilingual and real-world clinical settings, LLMs have proven in handling complex entity spans that include word-level and phrase-level annotations, particularly for long and attribute-rich entities.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 6","pages":"ooaf157"},"PeriodicalIF":3.4,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12657458/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145649357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Artificial intelligence-generated draft replies to patient messages in pediatrics.
Pub Date: 2025-11-22 | eCollection Date: 2025-12-01 | DOI: 10.1093/jamiaopen/ooaf159
April S Liang, Shivam Vedak, Alex Dussaq, Dong-Han Yao, Joshua A Villarreal, Sijo Thomas, Nicholas Chen, Tanya Townsend, Natalie M Pageler, Keith Morse
Objectives: This study describes the utilization and experiences of artificial intelligence (AI)-generated draft responses to patient messages among pediatric ambulatory clinicians and contextualizes their experiences in relation to those of adult specialty clinicians.
Materials and methods: A prospective pilot was conducted from September 2023 to August 2024 in 2 pediatric clinics (General Pediatric and Adolescent Medicine) and 2 obstetric clinics (Reproductive Endocrinology and Infertility and General Obstetrics) within an academic health system in Northern California. Participants included physician, nurse, and medical assistant volunteers. The intervention involved a feature utilizing large language models embedded in the electronic health record to generate draft responses. Proportion of AI-generated draft used was collected, as were prepilot and follow-up surveys.
Results: A total of 61 clinicians (26 pediatric, 35 obstetric) enrolled, with 46 (75%) completing both surveys. Pediatric clinicians utilized 13.3% (95% CI, 12.3%-14.4%) of AI-generated drafts, and usage rates when responding to patients vs their proxies were similar (15% vs 12.9%, P = .24). Despite using AI-generated drafts significantly less than obstetric clinicians (18.3% [17.2%-19.5%], P < .0001), pediatric clinicians reported a significant reduction in perceived task load (NASA Task Load Index: 59.9-50.9, P = .04) and were more likely to recommend the tool (LTR: 7.0 vs 5.2, P = .04).
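For context on the reported interval: a 95% CI for a utilization proportion can be computed with the Wilson method. The counts below are hypothetical, chosen only to reproduce the reported 13.3% (12.3%-14.4%) figure; the study's raw counts are not given in the abstract.

```python
import math

def wilson_ci(k, n, z=1.96):
    # Wilson score interval for a binomial proportion k/n.
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

lo, hi = wilson_ci(532, 4000)  # hypothetical: 532 drafts used of 4000 messages
print(f"13.3% utilization, 95% CI ({lo:.1%}, {hi:.1%})")  # ~ (12.3%, 14.4%)
```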
Discussion and conclusion: Pediatric clinicians used AI-generated drafts at a rate within ranges previously reported in adult specialties and found the tool useful. These findings suggest this tool has potential for enhancing efficiency and reducing task load in pediatric care.
{"title":"Artificial intelligence-generated draft replies to patient messages in pediatrics.","authors":"April S Liang, Shivam Vedak, Alex Dussaq, Dong-Han Yao, Joshua A Villarreal, Sijo Thomas, Nicholas Chen, Tanya Townsend, Natalie M Pageler, Keith Morse","doi":"10.1093/jamiaopen/ooaf159","DOIUrl":"10.1093/jamiaopen/ooaf159","url":null,"abstract":"<p><strong>Objectives: </strong>This study describes the utilization and experiences of artificial intelligence (AI)-generated draft responses to patient messages in pediatric ambulatory clinicians and contextualizes their experiences in relation to those of adult specialty clinicians.</p><p><strong>Materials and methods: </strong>A prospective pilot was conducted from September 2023 to August 2024 in 2 pediatric clinics (General Pediatric and Adolescent Medicine) and 2 obstetric clinics (Reproductive Endocrinology and Infertility and General Obstetrics) within an academic health system in Northern California. Participants included physician, nurse, and medical assistant volunteers. The intervention involved a feature utilizing large language models embedded in the electronic health record to generate draft responses. Proportion of AI-generated draft used was collected, as were prepilot and follow-up surveys.</p><p><strong>Results: </strong>A total of 61 clinicians (26 pediatric, 35 obstetric) enrolled, with 46 (75%) completing both surveys. Pediatric clinicians utilized 13.3% (95% CI, 12.3%-14.4%) of AI-generated drafts, and usage rates when responding to patients vs their proxies was similar (15% vs 12.9%, <i>P</i> = .24). Despite using AI-generated drafts significantly less than obstetric clinicians (18.3% [17.2%-19.5%], <i>P</i> < .0001), pediatric clinicians reported a significant reduction in perceived task load (NASA Task Load Index: 59.9-50.9, <i>P</i> = .04) and were more likely to recommend the tool (LTR: 7.0 vs 5.2, <i>P</i> = .04).</p><p><strong>Discussion and conclusion: </strong>Pediatric clinicians used AI-generated drafts at a rate within previously reported ranges in adult specialties and experienced utility. These findings suggest this tool has potential for enhancing efficiency and reducing task load in pediatric care.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 6","pages":"ooaf159"},"PeriodicalIF":3.4,"publicationDate":"2025-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12643547/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145606516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mapping the overdose crisis: 6 locations using open medical examiner data.
Pub Date: 2025-10-30 | eCollection Date: 2025-10-01 | DOI: 10.1093/jamiaopen/ooaf140
Daniel R Harris, Nicholas Anthony, Kelly A Keyes, Chris Delcher
Objective: Medical examiners and coroners (ME/C) oversee medicolegal death investigations which determine causes of death and other contextual factors that may have influenced a death. We utilize open data releases from ME/C offices covering 6 different geographic areas to demonstrate the strengths and limitations of ME/C data for forensic epidemiology research.
Materials and methods: We use our novel geoPIPE tool to establish a pipeline that (a) automates ingesting open data releases, (b) geocodes records where possible to yield a spatial component, (c) enhances data with variables useful for overdose research, such as flagging substances contributing to each death, and (d) publishes the enriched data to our open repository. We use results from this pipeline to highlight similarities and differences of overdose data across different sources.
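A minimal sketch of stages (a) and (c) of such a pipeline, assuming a CSV release with a cause_of_death column; geoPIPE itself is the authors' tool, and the column names, substance list, and URL here are illustrative only.

```python
import pandas as pd

# Hypothetical substance patterns for flagging contributing drugs.
SUBSTANCES = {"fentanyl": r"fentanyl", "heroin": r"heroin",
              "methamphetamine": r"meth(amphetamine)?", "cocaine": r"cocaine"}

def ingest(url: str) -> pd.DataFrame:
    return pd.read_csv(url)                      # (a) ingest an open release

def flag_substances(df: pd.DataFrame) -> pd.DataFrame:
    for name, pattern in SUBSTANCES.items():     # (c) enhance with flags
        df[name] = df["cause_of_death"].str.contains(
            pattern, case=False, regex=True, na=False)
    return df

df = flag_substances(ingest("https://example.org/mec_open_data.csv"))
# (b) geocoding would call an external service per record where resolution
# permits; (d) the enriched frame is then published to the open repository.
df.to_csv("enriched_overdose_data.csv", index=False)
```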
Results: Text processing to extract drugs contributing to each death yielded compatible data across all locations. Conversely, geospatial analyses are sometimes incompatible due to differences in available geographic resolution, which range from fine-grain latitude and longitude coordinates to larger regions identified by zip codes. Our pipeline pushes weekly results to an open repository.
Discussion: Open ME/C data are highly useful for research on substance use disorders; our visualizations demonstrate the ability to contextualize overdose data within and across specific geographic regions. Furthermore, the spatial component of our results enables clustering of overdose events and accessibility studies for resources related to preventing overdose deaths.
Conclusions: Given the utility to public health researchers, we advocate that other ME/C offices explore releasing open data and for policy makers to support and fund transparency efforts.
{"title":"Mapping the overdose crisis: 6 locations using open medical examiner data.","authors":"Daniel R Harris, Nicholas Anthony, Kelly A Keyes, Chris Delcher","doi":"10.1093/jamiaopen/ooaf140","DOIUrl":"10.1093/jamiaopen/ooaf140","url":null,"abstract":"<p><strong>Objective: </strong>Medical examiners and coroners (ME/C) oversee medicolegal death investigations which determine causes of death and other contextual factors that may have influenced a death. We utilize open data releases from ME/C offices covering 6 different geographic areas to demonstrate the strengths and limitations of ME/C data for forensic epidemiology research.</p><p><strong>Materials and methods: </strong>We use our novel geoPIPE tool to establish a pipeline that (a) automates ingesting open data releases, (b) geocodes records where possible to yield a spatial component, (c) enhances data with variables useful for overdose research, such as flagging substances contributing to each death, and (d) publishes the enriched data to our open repository. We use results from this pipeline to highlight similarities and differences of overdose data across different sources.</p><p><strong>Results: </strong>Text processing to extract drugs contributing to each death yielded compatible data across all locations. Conversely, geospatial analyses are sometimes incompatible due to differences in available geographic resolution, which range from fine-grain latitude and longitude coordinates to larger regions identified by zip codes. Our pipeline pushes weekly results to an open repository.</p><p><strong>Discussion: </strong>Open ME/C data are highly useful for research on substance use disorders; our visualizations demonstrate the ability to contextualize overdose data within and across specific geographic regions. Furthermore, the spatial component of our results enables clustering of overdose events and accessibility studies for resources related to preventing overdose deaths.</p><p><strong>Conclusions: </strong>Given the utility to public health researchers, we advocate that other ME/C offices explore releasing open data and for policy makers to support and fund transparency efforts.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 5","pages":"ooaf140"},"PeriodicalIF":3.4,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574786/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145431958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assessing the acceptability and usability of MedSafer, a patient-centered electronic deprescribing tool.
Pub Date: 2025-10-30 | eCollection Date: 2025-10-01 | DOI: 10.1093/jamiaopen/ooaf141
Jimin J Lee, Eva Filosa, Tiphaine Pierson, Ninh Khuong, Camille Gagnon, Jennie Herbin, Soham Rej, Claire Godard-Sebillotte, Robyn Tamblyn, Todd C Lee, Emily G McDonald
Background: Deprescribing is the clinically supervised process of stopping or reducing medications that are no longer beneficial. MedSafer is an electronic decision support tool that guides healthcare providers (HCPs) through the deprescribing process. We recently developed a novel patient-facing version of the software, allowing patients and caregivers to generate a personalized deprescribing report to bring to their prescriber.
Objective: The study aimed to evaluate the usability and acceptability of MedSafer among older adults, caregivers, and community HCPs (physicians, nurse practitioners and pharmacists).
Method: A mixed-methods feasibility study was conducted with a convenience sample of 100 older adults/caregivers and 25 healthcare practitioners. Participants were invited to test MedSafer and answer telephone or electronic surveys via REDCap. The Extended Technology Acceptance Model (TAM2) and System Usability Scale (SUS) were used for evaluation. A semi-structured interview was also conducted with a subset of participants (5 per group) selected on a volunteer basis, and thematic analysis followed Braun and Clarke's approach.
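The SUS is scored by a fixed published rule, independent of this study: odd-numbered items contribute (response - 1), even-numbered items contribute (5 - response), and the sum is scaled by 2.5 to a 0-100 range. A sketch:

```python
def sus_score(responses):
    # responses: ratings for the 10 SUS items, each on a 1-5 scale.
    assert len(responses) == 10
    total = sum(r - 1 if i % 2 == 0 else 5 - r  # index 0 -> item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5  # scaled to 0-100

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0
```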
Results: Healthcare providers scored more favorably than patients and caregivers on TAM2 constructs such as perceived usefulness (PU) (median: 4.25 for HCPs; 3.75 for caregivers; 3.00 for patients) and on the SUS (mean: 79.50 for HCPs; 52.95 for caregivers; 55.75 for patients). Thematic analysis revealed that participants recognized MedSafer as an empowering tool but noted the need for some usability improvements.
Conclusion: MedSafer is a promising tool to support deprescribing conversations. Enhancing usability, accessibility, and patient education may improve adoption.
{"title":"Assessing the acceptability and usability of MedSafer, a patient-centered electronic deprescribing tool.","authors":"Jimin J Lee, Eva Filosa, Tiphaine Pierson, Ninh Khuong, Camille Gagnon, Jennie Herbin, Soham Rej, Claire Godard-Sebillotte, Robyn Tamblyn, Todd C Lee, Emily G McDonald","doi":"10.1093/jamiaopen/ooaf141","DOIUrl":"10.1093/jamiaopen/ooaf141","url":null,"abstract":"<p><strong>Background: </strong>Deprescribing is the clinically supervised process of stopping or reducing medications that are no longer beneficial. MedSafer is an electronic decision support tool that guides healthcare providers (HCPs) through the deprescribing process. We recently developed a novel patient-facing version of the software, allowing patients and caregivers to generate a personalized deprescribing report to bring to their prescriber.</p><p><strong>Objective: </strong>The study aimed to evaluate the usability and acceptability of MedSafer among older adults, caregivers, and community HCPs (physicians, nurse practitioners and pharmacists).</p><p><strong>Method: </strong>A mixed-methods feasibility study was conducted with a convenience sample of 100 older adults/caregivers, and 25 healthcare practitioners. Participants were invited to test MedSafer and answer telephone or electronic surveys via RedCap. The Extended Technology Acceptance Model (TAM2) and System Usability Scale (SUS) were used for evaluation. A semi-structured interview was also conducted with a subset of participants (5 per group) who were selected on a volunteer basis, and thematic analysis was used following Braun & Clarke's approach.</p><p><strong>Results: </strong>Healthcare providers scored more favorably on TAM2 constructs such as perceived usefulness (PU) (median: 4.25 for HCPs; 3.75 for caregivers; 3.00 for patients), and SUS compared to patients and caregivers (mean: 79.50 for HCPs; 52.95 for caregivers; 55.75 for patients). Thematic analysis revealed that participants recognized MedSafer as an empowering tool but noted the need for some usability improvements.</p><p><strong>Conclusion: </strong>MedSafer is a promising tool to support deprescribing conversations. Enhancing usability, accessibility, and patient education may improve adoption.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 5","pages":"ooaf141"},"PeriodicalIF":3.4,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574792/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145432489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating the Impact of Electronic Health Record to Electronic Data Capture Technology on Workflow Efficiency: a Site Perspective.
Pub Date: 2025-10-30 | eCollection Date: 2025-10-01 | DOI: 10.1093/jamiaopen/ooaf139
Anna Patruno, Michael-Owen Panzarella, Michael Buckley, Milena Silverman, Evelyn Salazar, Renata Panchal, Joseph Lengfellner, Alexia Iasonos, Maryam Garza, Byeong Yeob Choi, Meredith Zozus, Stephanie Terzulli, Paul Sabbatini
Introduction: Clinical trial data are still predominantly entered manually by site staff into Electronic Data Capture (EDC) systems. This process of abstracting and manually transcribing patient data is time-consuming, inefficient, and error-prone. Electronic Health Record to Electronic Data Capture (EHR-To-EDC) technologies that digitize this process could reduce these inefficiencies.
Objectives: This study measured the impact of EHR-To-EDC technology on the data entry workflow of clinical trial data managers. The primary objective was to compare the speed and accuracy of the EHR-To-EDC enabled data entry method to the traditional, manual method. The secondary objective was to measure end user satisfaction.
Materials and methods: Five data managers, ranging in experience from 9 months to over 2 years, were assigned an investigator-initiated, Memorial Sloan Kettering-sponsored oncology study within their disease area of expertise. Each data manager performed one hour of manual data entry and, a week later, one hour of data entry using IgniteData's EHR-To-EDC solution, Archer, on a predetermined set of patients, timepoints, and data domains (labs, vitals). The data entered into the EDC were compared side-by-side and used to evaluate the speed and accuracy of the EHR-To-EDC enabled method versus traditional, manual data entry. A user satisfaction survey using a 5-point Likert scale collected feedback regarding the platform's learnability, ease of use, perceived time savings, perceived efficiency, and preference over the manual method.
Results: The EHR-To-EDC method resulted in 58% more data entered versus the manual method (difference, 1745 data points; manual, 3023 data points; EHR-To-EDC, 4768 data points). The number of data entry errors was reduced by 99% (manual, 100 data points; EHR-To-EDC, 1 data point). Regarding user satisfaction, data managers either agreed or strongly agreed that the EHR-To-EDC workflow was easy to learn (5/5), easy to use (4.6/5), saved time (5/5), was more efficient (4.8/5), and preferred it over the manual entry workflow (4/5).
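A quick check of the reported figures, using only the numbers quoted above:

```python
manual, ehr_to_edc = 3023, 4768
print(ehr_to_edc - manual)                          # 1745 more data points
print(round((ehr_to_edc - manual) / manual * 100))  # ~58% more data entered
print(round((1 - 1 / 100) * 100))                   # errors 100 -> 1: 99% fewer
```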
Conclusion: EHR-To-EDC enabled data entry increases data manager productivity, reduces errors and is preferred by data managers over manual data entry.
{"title":"Evaluating the Impact of Electronic Health Record to Electronic Data Capture Technology on Workflow Efficiency: a Site Perspective.","authors":"Anna Patruno, Michael-Owen Panzarella, Michael Buckley, Milena Silverman, Evelyn Salazar, Renata Panchal, Joseph Lengfellner, Alexia Iasonos, Maryam Garza, Byeong Yeob Choi, Meredith Zozus, Stephanie Terzulli, Paul Sabbatini","doi":"10.1093/jamiaopen/ooaf139","DOIUrl":"10.1093/jamiaopen/ooaf139","url":null,"abstract":"<p><strong>Introduction: </strong>Clinical trial data is still predominantly manually entered by site staff into Electronic Data Capture (EDC) systems. This process of abstracting and manually transcribing patient data is time-consuming, inefficient and error prone. Use of Electronic Health Record to Electronic Data Capture (EHR-To-EDC) technologies that digitize this process would improve these inefficiencies.</p><p><strong>Objectives: </strong>This study measured the impact of EHR-To-EDC technology on the data entry workflow of clinical trial data managers. The primary objective was to compare the speed and accuracy of the EHR-To-EDC enabled data entry method to the traditional, manual method. The secondary objective was to measure end user satisfaction.</p><p><strong>Materials and methods: </strong>Five data managers ranging in experience from 9 months to over 2 years, were assigned an investigator-initiated, Memorial Sloan Kettering-sponsored oncology study within their disease area of expertise. Each data manager performed one-hour of manual data entry, and a week later, one-hour of data entry using IgniteData's EHR-To-EDC solution, Archer, on a predetermined set of patients, timepoints and data domains (labs, vitals). The data entered into the EDC were compared side-by-side and used to evaluate the speed and accuracy of the EHR-To-EDC enabled method versus traditional, manual data entry. A user satisfaction survey using a 5-point Likert scale was used to collect feedback regarding the selected platform's learnability, ease of use, perceived time savings, perceived efficiency, and preference over the manual method.</p><p><strong>Results: </strong>The EHR-To-EDC method resulted in 58% more data entered versus the manual method (difference, 1745 data points; manual, 3023 data points; EHR-To-EDC, 4768 data points). The number of data entry errors was reduced by 99% (manual, 100 data points; EHR-To-EDC, 1 data point). Regarding user satisfaction, data managers either agreed or strongly agreed that the EHR-To-EDC workflow was easy to learn (5/5), easy to use (4.6/5), saved time (5/5), was more efficient (4.8/5), and preferred it over the manual entry workflow (4/5).</p><p><strong>Conclusion: </strong>EHR-To-EDC enabled data entry increases data manager productivity, reduces errors and is preferred by data managers over manual data entry.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 5","pages":"ooaf139"},"PeriodicalIF":3.4,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574785/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145431546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Engaging end-users to develop a novel algorithm to process electronic medication adherence monitoring device data.
Pub Date: 2025-10-30 | eCollection Date: 2025-10-01 | DOI: 10.1093/jamiaopen/ooaf135
Meghan E McGrady, Kevin A Hommel, Constance A Mara, Gabriella Breen, Michal Kouril
Objective: To engage end-users to develop and evaluate an algorithm to convert electronic adherence monitoring device (EAMD) output into the adherence data required for analyses.
Materials and methods: This study included 4 phases. First, process mapping interviews and focus groups were conducted to identify rules for EAMD data processing and user needs. Second, algorithm parameters required to compute daily adherence values were defined and coded in an R package (OncMAP). Third, algorithm-produced data were compared to manually recoded data to evaluate the algorithm's sensitivity, specificity, and accuracy. Finally, pilot testing was conducted to obtain feedback on the perceived value/benefit of the algorithm and features that should be considered during software development.
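The actual decision rules live in the authors' OncMAP R package; as an illustration of the kind of rule being parameterized, the sketch below collapses device actuation timestamps into daily adherence values and scores the algorithm against manual coding. Dose schedules and edge-case handling are assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta

def daily_adherence(actuations, start: date, end: date, doses_per_day=1):
    # Collapse device-recorded open events (datetimes) into one value per day.
    counts = defaultdict(int)
    for ts in actuations:
        counts[ts.date()] += 1
    out = {}
    for i in range((end - start).days + 1):
        d = start + timedelta(days=i)
        out[d] = min(counts[d], doses_per_day) / doses_per_day  # 0.0-1.0
    return out

def sensitivity_specificity(algo, manual):
    # Compare algorithm-coded vs manually recoded daily values (0/1).
    tp = sum(a == 1 and m == 1 for a, m in zip(algo, manual))
    tn = sum(a == 0 and m == 0 for a, m in zip(algo, manual))
    fn = sum(a == 0 and m == 1 for a, m in zip(algo, manual))
    fp = sum(a == 1 and m == 0 for a, m in zip(algo, manual))
    return tp / (tp + fn), tn / (tn + fp)
```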
Results: EAMD data processing rules were identified and coded in an R application. The algorithm correctly classified all complete observations with 100% sensitivity and specificity. The receiver operating characteristic curve analysis yielded an area under the curve of 1.00. All pilot testing participants expressed interest in using the algorithm (Net Promoter Score = 71%) but identified several features essential for inclusion in the software package to ensure widespread adoption.
Discussion: The decision rules implemented to process EAMD actuation data can be parameterized to develop an algorithm to automate this process. The algorithm demonstrated high sensitivity, specificity, and accuracy. End-users were enthusiastic about the product and provided insights to inform the development of a software package including the algorithm.
Conclusion: A rule-based algorithm can accurately process EAMD actuation data and has the potential to improve the rigor and pace of adherence science.
{"title":"Engaging end-users to develop a novel algorithm to process electronic medication adherence monitoring device data.","authors":"Meghan E McGrady, Kevin A Hommel, Constance A Mara, Gabriella Breen, Michal Kouril","doi":"10.1093/jamiaopen/ooaf135","DOIUrl":"10.1093/jamiaopen/ooaf135","url":null,"abstract":"<p><strong>Objective: </strong>To engage end-users to develop and evaluate an algorithm to convert electronic adherence monitoring device (EAMD) output into the adherence data required for analyses.</p><p><strong>Materials and methods: </strong>This study included 4 phases. First, process mapping interviews and focus groups were conducted to identify rules for EAMD data processing and user needs. Second, algorithm parameters required to compute daily adherence values were defined and coded in an R package (OncMAP). Third, algorithm-produced data were compared to manually recoded data to evaluate the algorithm's sensitivity, specificity, and accuracy. Finally, pilot testing was conducted to obtain feedback on the perceived value/benefit of the algorithm and features that should be considered during software development.</p><p><strong>Results: </strong>EAMD data processing rules were identified and coded in an R application. The algorithm correctly classified all complete observations with 100% sensitivity and specificity. The receiver operating characteristic curve analysis yielded an area under the curve of 1.00. All pilot testing participants expressed interest in using the algorithm (Net Promoter Score = 71%) but identified several features essential for inclusion in the software package to ensure widespread adoption.</p><p><strong>Discussion: </strong>The decision rules implemented to process EAMD actuation data can be parameterized to develop an algorithm to automate this process. The algorithm demonstrated high sensitivity, specificity, and accuracy. End-users were enthusiastic about the product and provided insights to inform the development of a software package including the algorithm.</p><p><strong>Conclusion: </strong>A rule-based algorithm can accurately process EAMD actuation data and has the potential to improve the rigor and pace of adherence science.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 5","pages":"ooaf135"},"PeriodicalIF":3.4,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574791/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145432476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Meeting clinical recruitment milestones in an academic center: a data-driven, visual approach.
Pub Date: 2025-10-30 | eCollection Date: 2025-10-01 | DOI: 10.1093/jamiaopen/ooaf125
Anna E Burns, John Tumberger, Mariah Brewe, Michael Bartkoski, Stephani L Stancil
Objectives: To describe the development of a visual dashboard that leverages available tools for efficient recruitment in patient-centered clinical trials within resource-constrained settings.
Materials and methods: A real-time, visual dashboard was developed, facilitating interactive visualizations, detailed analyses, and data quality control. Daily automated REDCap data retrieval occurred via an R program using the REDCap API, and the output was integrated into Power BI. An interrupted time series analysis evaluated the effects of the dashboard on clinical trial recruitment metrics.
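The paper describes an R program for the retrieval step; an equivalent call against the standard REDCap record-export API, sketched in Python with placeholder URL and token, looks like this:

```python
from io import StringIO
import requests
import pandas as pd

def pull_redcap_records(api_url: str, token: str) -> pd.DataFrame:
    # Standard REDCap record export: POST with token + content parameters.
    resp = requests.post(api_url, data={
        "token": token,
        "content": "record",
        "format": "csv",
        "type": "flat",
    })
    resp.raise_for_status()
    return pd.read_csv(StringIO(resp.text))

df = pull_redcap_records("https://redcap.example.edu/api/", "YOUR_TOKEN")
df.to_csv("recruitment_export.csv", index=False)  # file consumed by Power BI
```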
Results: The visual dashboard displayed key recruitment metrics, including individual participant progression and recruitment trends over time. Interrupted time series analysis showed improvements in screening rates upon implementation. The mean time to study completion decreased by 19 days following implementation.
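The abstract does not give the model specification; a common segmented-regression form of interrupted time series analysis, with assumed column names, would be:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical weekly data: 'week' index, 'screened' count, 'post' = 1 after
# dashboard implementation.
df = pd.read_csv("weekly_screening.csv")
df["time_since"] = (df["week"]
                    - df.loc[df["post"] == 1, "week"].min()).clip(lower=0)
model = smf.ols("screened ~ week + post + time_since", data=df).fit()
print(model.summary())  # 'post' = level change, 'time_since' = slope change
```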
Discussion: Customizable metrics offer a comprehensive, granular view of recruitment data, identifying actionable issues and enhancing study timeliness and completion.
Conclusion: Clinical trials of all budgets can integrate dashboards for real-time monitoring and data-driven improvements to promote more timely completion.
{"title":"Meeting clinical recruitment milestones in an academic center: a data-driven, visual approach.","authors":"Anna E Burns, John Tumberger, Mariah Brewe, Michael Bartkoski, Stephani L Stancil","doi":"10.1093/jamiaopen/ooaf125","DOIUrl":"10.1093/jamiaopen/ooaf125","url":null,"abstract":"<p><strong>Objectives: </strong>Describing the development of a visual dashboard leveraging available tools for efficient recruitment for patient centered clinical trials in resource constrained settings.</p><p><strong>Materials and methods: </strong>A real-time, visual dashboard was developed, facilitating interactive visualizations, detailed analyses, and data quality control. Daily automated REDCap data retrieval occurred via an R program using REDCap API and output was integrated into Power BI. An interrupted time series analysis was conducted evaluating effects of dashboard on clinical trial recruitment metrics.</p><p><strong>Results: </strong>The visual dashboard displayed key recruitment metrics, including individual participant progression and recruitment trends over time. Interrupted time series analysis showed improvements in screening rates upon implementation. The mean time to study completion decreased by 19 days following implementation.</p><p><strong>Discussion: </strong>Customizable metrics offer comprehensive view of recruitment data and granularity, identifying actionable issues, enhancing study timeliness and completion.</p><p><strong>Conclusion: </strong>Clinical trials of all budgets can integrate dashboards for real-time monitoring and data driven improvements to promote more timely completion.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 5","pages":"ooaf125"},"PeriodicalIF":3.4,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574790/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145431951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Human-centered design of an artificial intelligence monitoring system: the Vanderbilt Algorithmovigilance Monitoring and Operations System.
Pub Date: 2025-10-30 | eCollection Date: 2025-10-01 | DOI: 10.1093/jamiaopen/ooaf136
Megan E Salwei, Sharon E Davis, Carrie Reale, Laurie L Novak, Colin G Walsh, Russ Beebe, Scott Nelson, Sameer Sundrani, Susannah Rose, Adam Wright, Michael Ripperger, Peter Shave, Peter Embí
Objectives: As the use of artificial intelligence (AI) in healthcare is rapidly expanding, there is also growing recognition of the need for ongoing monitoring of AI after implementation, called algorithmovigilance. Yet, there remain few systems that support systematic monitoring and governance of AI used across a health system. In this study, we identify end-user needs for a novel AI monitoring system-the Vanderbilt Algorithmovigilance Monitoring and Operations System (VAMOS)-using human-centered design (HCD).
Materials and methods: We assembled a multidisciplinary team to plan AI monitoring and governance at Vanderbilt University Medical Center. We then conducted 9 participatory design sessions with diverse stakeholders to develop prototypes of VAMOS. Once we had a working prototype, we conducted 8 formative design interviews with key stakeholders to gather feedback on the system. We analyzed the interviews using a rapid qualitative analysis approach and revised the mock-ups. We then conducted a multidisciplinary heuristic evaluation to identify further improvements to the tool.
Results: Through an iterative, HCD process that engaged diverse end-users, we identified key components needed in AI monitoring systems. We identified specific data views and functionality required by end-users across several user interfaces, including a performance monitoring dashboard, accordion snapshots, and model-specific pages.
Discussion: We distilled general design requirements for systems to support AI monitoring throughout its lifecycle. One important consideration is how to support teams of health system leaders, clinical experts, and technical personnel that are distributed across the organization as they monitor and respond to algorithm deterioration.
Conclusion: VAMOS aims to support systematic and proactive monitoring of AI tools in healthcare organizations. Our findings and recommendations can support the design of AI monitoring systems to support health systems, improve quality of care, and ensure patient safety.
{"title":"Human-centered design of an artificial intelligence monitoring system: the Vanderbilt Algorithmovigilance Monitoring and Operations System.","authors":"Megan E Salwei, Sharon E Davis, Carrie Reale, Laurie L Novak, Colin G Walsh, Russ Beebe, Scott Nelson, Sameer Sundrani, Susannah Rose, Adam Wright, Michael Ripperger, Peter Shave, Peter Embí","doi":"10.1093/jamiaopen/ooaf136","DOIUrl":"10.1093/jamiaopen/ooaf136","url":null,"abstract":"<p><strong>Objectives: </strong>As the use of artificial intelligence (AI) in healthcare is rapidly expanding, there is also growing recognition of the need for ongoing monitoring of AI after implementation, called <i>algorithmovigilance</i>. Yet, there remain few systems that support systematic monitoring and governance of AI used across a health system. In this study, we identify end-user needs for a novel AI monitoring system-the Vanderbilt Algorithmovigilance Monitoring and Operations System (VAMOS)-using human-centered design (HCD).</p><p><strong>Materials and methods: </strong>We assembled a multidisciplinary team to plan AI monitoring and governance at Vanderbilt University Medical Center. We then conducted 9 participatory design sessions with diverse stakeholders to develop prototypes of VAMOS. Once we had a working prototype, we conducted 8 formative design interviews with key stakeholders to gather feedback on the system. We analyzed the interviews using a rapid qualitative analysis approach and revised the mock-ups. We then conducted a multidisciplinary heuristic evaluation to identify further improvements to the tool.</p><p><strong>Results: </strong>Through an iterative, HCD process that engaged diverse end-users, we identified key components needed in AI monitoring systems. We identified specific data views and functionality required by end users across several user interfaces including a performance monitoring dashboard, accordion snapshots, and model-specific pages.</p><p><strong>Discussion: </strong>We distilled general design requirements for systems to support AI monitoring throughout its lifecycle. One important consideration is how to support teams of health system leaders, clinical experts, and technical personnel that are distributed across the organization as they monitor and respond to algorithm deterioration.</p><p><strong>Conclusion: </strong>VAMOS aims to support systematic and proactive monitoring of AI tools in healthcare organizations. Our findings and recommendations can support the design of AI monitoring systems to support health systems, improve quality of care, and ensure patient safety.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 5","pages":"ooaf136"},"PeriodicalIF":3.4,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574793/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145431970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated survey collection with LLM-based conversational agents.
Pub Date: 2025-10-30 | eCollection Date: 2025-10-01 | DOI: 10.1093/jamiaopen/ooaf103
Kurmanbek Kaiyrbekov, Nicholas J Dobbins, Sean D Mooney
Objectives: Phone surveys are crucial for collecting health data but are expensive, time-consuming, and difficult to scale. To overcome these limitations, we propose a survey collection approach powered by conversational Large Language Models (LLMs).
Materials and methods: Our framework leverages an LLM-powered conversational agent to conduct surveys and transcribe conversations, along with an LLM (GPT-4o) to extract responses from the transcripts. We evaluated the framework's performance by analyzing transcription errors, the accuracy of inferred survey responses, and participant experiences across 40 survey responses collected from a convenience sample of 8 individuals, each role-playing five LLM-generated personas.
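A sketch of the extraction step, using the OpenAI Python SDK; the prompt, schema, and question list are illustrative, not the study's materials:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_responses(transcript: str, questions: list[str]) -> str:
    # Ask the model to map a call transcript to structured survey answers.
    prompt = (
        "Given this survey call transcript, return the respondent's answer "
        "to each question as JSON {question: answer}.\n\n"
        f"Questions: {questions}\n\nTranscript:\n{transcript}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic extraction
    )
    return resp.choices[0].message.content

print(extract_responses("Agent: How many days did you exercise? ...",
                        ["Days exercised last week"]))
```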
Results: GPT-4o extracted responses to survey questions with an average accuracy of 98%, despite an average transcription word error rate of 7.7%. Participants reported occasional errors by the conversational agent but praised its ability to demonstrate comprehension and maintain engaging conversations.
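Word error rate here is the standard Levenshtein-based measure: word-level edit distance between reference and hypothesis, divided by the number of reference words. A self-contained sketch:

```python
def wer(reference: str, hypothesis: str) -> float:
    r, h = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

print(wer("i walk two miles a day", "i walked two miles today"))  # 0.5
```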
Discussion and conclusion: Our study showcases the potential of LLM agents to enable scalable, AI-powered phone surveys, reducing human effort and advancing healthcare data collection.
{"title":"Automated survey collection with LLM-based conversational agents.","authors":"Kurmanbek Kaiyrbekov, Nicholas J Dobbins, Sean D Mooney","doi":"10.1093/jamiaopen/ooaf103","DOIUrl":"10.1093/jamiaopen/ooaf103","url":null,"abstract":"<p><strong>Objectives: </strong>Phone surveys are crucial for collecting health data but are expensive, time-consuming, and difficult to scale. To overcome these limitations, we propose a survey collection approach powered by conversational Large Language Models (LLMs).</p><p><strong>Materials and methods: </strong>Our framework leverages an LLM-powered conversational agent to conduct surveys and transcribe conversations, along with an LLM (GPT-4o) to extract responses from the transcripts. We evaluated the framework's performance by analyzing transcription errors, the accuracy of inferred survey responses, and participant experiences across 40 survey responses collected from a convenience sample of 8 individuals, each adopting the role of five LLM-generated personas.</p><p><strong>Results: </strong>GPT-4o extracted responses to survey questions with an average accuracy of 98%, despite an average transcription word error rate of 7.7%. Participants reported occasional errors by the conversational agent but praised its ability to demonstrate comprehension and maintain engaging conversations.</p><p><strong>Discussion and conclusion: </strong>Our study showcases the potential of LLM agents to enable scalable, AI-powered phone surveys, reducing human effort and advancing healthcare data collection.</p>","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"8 5","pages":"ooaf103"},"PeriodicalIF":3.4,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12574787/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145432479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}