Pub Date: 2026-02-25. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooag020
Gregory L Watson, Grace Staples, Robin Carver, Akhil Bhargava, Carlos López-Espina, Lee Schmalz, Farhan Ali, Peter S Antkowiak, Saleem Azad, Ramona Berghea, Lavneet Chawla, Matthew Crisp, Alon Dagan, Francisco Davila, Hugo Davila, Carmen DeMarco, Amanda Doodlesack, Aimee Espinosa, Neil S Evans, Clinton Ezekiel, Andrew Friederich, Falgun Gosai, Alexandra Halalau, Karthik Iyer, Max S Kravitz, Niko Kurtzman, John H Lee, Nicholas Maddens, Roneil Malkani, Stockton Mayer, Vikram Oke, Ashok V Palagiri, Roshni Patel, Lekshminarayan Raghavakurup, Samuel Raouf, Eric Reseland, Farid Sadaka, Deesha Sarma, Scott Smith, Tatyana Shvilkina, Matthew D Sims, Sahib Singh, Bryan A Stenson, Anwaruddin Syed, Muleta Tafa, Kurian Thomas, Sihai Dave Zhao, Ruoqing Zhu, Rashid Bashir, Bobby Reddy, Nathan I Shapiro
Objectives: To assess the interpretability and acceptance of Shapley values for making artificial intelligence/machine learning (AI/ML) tools more transparent, interpretable, and useful to clinicians.
Materials and methods: Structured assessments were conducted with 30 clinicians (15 providers; 15 nurses; 8 assessments per clinician) to evaluate their ability to understand interventional Shapley Additive exPlanations (SHAP) values, a type of Shapley value that provides individualized variable importance scores, and to ascertain their perspective on the utility of SHAP values for an AI/ML sepsis diagnostic. Participants were shown the diagnostic interface for real clinical scenarios with de-identified patient data, with and without SHAP values. The primary outcomes were clinician ability to correctly interpret SHAP values and clinician self-reported improvement in their understanding of how the AI/ML algorithm produced its result.
Results: Participants correctly interpreted SHAP values in 235 of 240 assessments (98%; CI, 95%-99%) and reported SHAP values improved their understanding of how the algorithm produced its result in every case (240/240; 100%; CI, 99%-100%). Participants were unanimous (30/30) in preferring the interface with SHAP values over the interface without.
Discussion: Clinician participants strongly preferred the device interface with SHAP values, were unanimous in reporting SHAP values improved their understanding of the AI/ML diagnostic, and scored nearly perfectly when asked to interpret SHAP values.
Conclusion: These results suggest health care providers value transparency into AI/ML algorithms designed for clinical use, and that Shapley values are a useful approach to providing that transparency, which in turn may improve tool adoption and clinical utility.
Title: Interpretability of an FDA-authorized AI/ML sepsis diagnostic tool improved by SHAP values. JAMIA Open. 2026;9(1):ooag020.
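The property that makes SHAP values readable on a device interface is additivity: the per-variable contributions sum to the difference between the model's output and a baseline prediction. A minimal sketch of an exact Shapley computation illustrates this; the brute-force enumeration, the toy linear "risk score," and the feature/baseline values are invented for illustration and are not the device's interventional SHAP implementation (which would use an optimized library such as shap).

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for model f at input x.

    Features absent from a coalition are replaced by their baseline
    value (the interventional convention). Exponential in the number
    of features -- illustration only.
    """
    n = len(x)

    def value(coalition):
        # Evaluate f with coalition features taken from x, the rest from baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Shapley kernel weight for a coalition of size k.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (value(set(S) | {i}) - value(set(S)))
    return phi

# Toy linear risk score; for linear models phi_i = w_i * (x_i - b_i).
f = lambda z: 2.0 * z[0] + 1.0 * z[1] - 0.5 * z[2]
x, b = [3.0, 1.0, 2.0], [1.0, 1.0, 1.0]
phi = shapley_values(f, x, b)
# Additivity: contributions sum to f(x) - f(baseline).
assert abs(sum(phi) - (f(x) - f(b))) < 1e-9
```

The additivity check at the end is exactly what lets an interface present each variable's score as "how much this input pushed the prediction up or down."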
Pub Date: 2026-02-25. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooag018
Huan Li, Varada V Khanna, Nate Apathy, A Jay Holmgren, Andrew J Loza, Edward R Melnick
Objective: To explore the relationship between ambulatory physician electronic health record (EHR) use characteristics and proxies for physician efficiency.
Materials and methods: A longitudinal cohort study was conducted to examine physician-month EHR use metadata in 413 US organizations between May 2019 and April 2022. A multi-model machine learning classifier was developed to predict physician efficiency. The main outcomes of the study were physician efficiency, measured as the proportion of same-day chart completion by specialty, and productivity, measured as daily patient visit volume, both segmented into quintiles.
Results: The study included 218 610 unique physicians with 5 193 385 physician-month observations from 413 organizations with an average chart completion efficiency of 72.9% and 10.8 visits per scheduled day. The primary ML analysis achieved an accuracy of 0.74 in classifying physician-months with high chart completion efficiency and highlighted associations with key features, such as inbox message turnaround time <1.5 days and after-hours documentation <25 min/scheduled day. A secondary analysis achieved an accuracy of 0.84 in classifying physician-months with high visit volumes, indicating that factors such as EHR time outside scheduled hours <4.1 min/visit and clinical review time <3.2 min/visit were associated with higher visit volumes.
Discussion and conclusion: Implementing specific EHR use measures with distinct thresholds, such as inbox management and after-hours documentation, could help target interventions to enhance productivity, providing actionable insights to create balanced and efficient work environments that improve patient care and reduce EHR time.
Title: Electronic health record use factors linked to efficiency and productivity: an explainable machine learning analysis. JAMIA Open. 2026;9(1):ooag018.
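Both outcomes above are defined by segmenting a continuous measure into quintiles before classification. A minimal rank-based sketch of that binning step, assuming equal-frequency bins; the sample efficiency values are invented, and a real analysis would bin within specialty (e.g. with pandas.qcut), as the study does.

```python
def quintile_labels(values, bins=5):
    """Assign each observation an equal-frequency quantile bin 1..bins.

    Ties are broken by position in the input list.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = 1 + rank * bins // len(values)
    return labels

# Invented same-day chart-completion proportions for 10 physician-months.
eff = [0.55, 0.92, 0.71, 0.63, 0.88, 0.74, 0.40, 0.97, 0.69, 0.81]
labels = quintile_labels(eff)
# The top quintile (label 5) is the "high chart completion efficiency" class.
high = [e for e, q in zip(eff, labels) if q == 5]
```

The classifier then predicts membership in that top-quintile class from EHR-use features such as inbox turnaround time and after-hours documentation.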
Pub Date: 2026-02-23. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooaf173
[This corrects the article DOI: 10.1093/jamiaopen/ooaf134.].
Title: Correction to: Biomedical data repositories require governance for artificial intelligence/machine learning applications at every step. JAMIA Open. 2026;9(1):ooaf173.
Pub Date: 2026-02-21. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooaf149
Muhammad Imran, Olga Zamaraeva, Carlos Gómez-Rodríguez
Objective: This study evaluates the usefulness of explicit syntactic knowledge, integrated via a neural mechanism, in improving the accuracy of named entity recognition in the domain of biomedical text processing.
Materials and methods: The syntactic structure of a text can help determine whether a given span is an entity. Parsing is a core natural language processing (NLP) technique for determining the syntactic structure of sentences. We propose to infuse syntactic knowledge through the attention mechanism using dependency parsing and sequence labelling parsing, as well as the multi-task learning paradigm. Experiments were conducted on five datasets: MTSamples, VAERS, NCBI-disease, BC2GM, and JNLPBA.
Results: We demonstrate improvements in the F1 score over the current state of the art on 3 out of 5 datasets (MTSamples, VAERS, and NCBI).
Discussion: We reduce the number of mismatches with gold labels, particularly on en-dash and parenthesis tokens and on compound and adjective-modifier dependencies.
Conclusion: Syntactic features improve NER accuracy in attention-based neural systems, and parsing as sequence labelling brings additional benefits.
Title: SynNER: syntax-infused named entity recognition in the biomedical domain. JAMIA Open. 2026;9(1):ooaf149.
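The reported F1 improvements rest on entity-level scoring, where a prediction counts only on an exact span-and-type match (the usual strict CoNLL-style criterion). A minimal sketch; the example spans and entity types are invented.

```python
def entity_f1(gold, pred):
    """Entity-level precision, recall, and F1 for NER output.

    Entities are (start, end, type) tuples; a true positive requires
    an exact match on both span boundaries and type.
    """
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Invented gold and predicted entities over token offsets.
gold = {(0, 2, "Disease"), (5, 7, "Gene"), (9, 10, "Disease")}
pred = {(0, 2, "Disease"), (5, 7, "Gene"), (12, 13, "Gene")}
p, r, f = entity_f1(gold, pred)
```

Under this strict criterion, boundary errors on tokens such as en-dashes and parentheses, the mismatch types the Discussion highlights, cost both a false positive and a false negative.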
Pub Date: 2026-02-18. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooag019
Yang Yang, Kathryn I Pollak, Bibhas Chakraborty, Molei Liu, Doudou Zhou, Chuan Hong
Objectives: Electronic health record (EHR) phenotyping often relies on noisy proxy labels, which undermine the reliability of downstream risk prediction. Active learning can reduce annotation costs, but typical heuristics do not directly optimize downstream prediction. Our goal was to develop a framework that directly uses downstream prediction performance as feedback to guide phenotype correction and sample selection under constrained labeling budgets.
Materials and methods: We propose reinforcement-enhanced label-efficient active phenotyping (RELEAP), a reinforcement learning-based active learning framework. RELEAP adaptively integrates multiple querying strategies and, unlike prior methods, updates its policy based on feedback from downstream models. We evaluated RELEAP on a de-identified Duke University Health System (DUHS) cohort (2014-2024) for incident lung cancer risk prediction, using logistic regression and penalized Cox survival models. Performance was benchmarked against noisy-label baselines and single-strategy active learning.
Results: RELEAP improved over the proxy-only baseline and approached oracle performance under the same budget. Logistic AUC increased from 0.774 to 0.807, and the survival concordance index increased from 0.715 to 0.749. Gains were stable across iterations using downstream feedback. These trends were consistent in sex-stratified subgroup analyses (female vs male).
Discussion: By linking phenotype refinement to prediction outcomes, RELEAP learns which samples most improve downstream discrimination and calibration, offering a more principled alternative to fixed active learning rules.
Conclusion: RELEAP optimizes phenotype correction through downstream feedback, offering a scalable, label-efficient paradigm that reduces manual chart review and enhances the reliability of EHR-based risk prediction.
Title: RELEAP: reinforcement-enhanced label-efficient active phenotyping for electronic health records. JAMIA Open. 2026;9(1):ooag019.
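The core idea, reinforcing whichever querying strategy most improves the downstream model, can be caricatured as a bandit over strategies. A minimal sketch assuming an epsilon-greedy policy and stubbed components; the strategy names, the reward design, and the simulated AUC are invented, not the published framework's.

```python
import random

def releap_style_loop(strategies, downstream_auc, budget, eps=0.2, seed=0):
    """Bandit-style caricature of downstream-feedback active learning.

    At each step: pick a querying strategy, spend one unit of labeling
    budget on the samples it selects, and reinforce the strategy by the
    observed change in downstream performance. `strategies` maps a name
    to a label-acquisition callable; `downstream_auc` re-evaluates the
    downstream model. Both are placeholders.
    """
    rng = random.Random(seed)
    value = {s: 0.0 for s in strategies}
    counts = {s: 0 for s in strategies}
    auc = downstream_auc()
    for _ in range(budget):
        if rng.random() < eps:                       # explore
            s = rng.choice(sorted(strategies))
        else:                                        # exploit best estimate
            s = max(value, key=value.get)
        strategies[s]()                              # acquire/correct labels
        new_auc = downstream_auc()
        counts[s] += 1
        # Incremental mean of the downstream-feedback reward.
        value[s] += ((new_auc - auc) - value[s]) / counts[s]
        auc = new_auc
    return value

# Stub: one strategy reliably improves validation AUC, the other does not.
state = {"auc": 0.70}
def informative(): state["auc"] = min(0.90, state["auc"] + 0.01)
def uninformative(): pass

values = releap_style_loop(
    {"uncertainty": informative, "random": uninformative},
    downstream_auc=lambda: state["auc"], budget=10)
```

The loop concentrates budget on the strategy whose queried labels actually move the downstream metric, which is the behavior fixed active-learning heuristics lack.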
Pub Date: 2026-02-17. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooag016
Steven N Hart, Teya S Bergamaschi
Objectives: To develop and validate an agent-based Large Language Model (LLM) system for extracting structured data from breast cancer synoptic pathology reports and assess the performance gap between synthetic and real-world validation.
Materials and methods: We developed a modular artificial intelligence (AI) agent-based framework employing sequential specialized LLMs for parsing pathology reports and extracting structured data. We normalized College of American Pathologists (CAP) cancer protocols into 8 sections, 86 subsections, and 229 discrete fields. Seven leading LLMs (gemini-2.5-pro, llama3.3-70b, phi4-14b, deepseek-r1 14B/70B, gemma3-27b, gemini-2.0-flash-lite) were validated using dual evaluation: synthetic validation (864 controlled test cases) and real-world ground truth (6651 annotated fields from 90 pathology reports).
Results: Synthetic validation demonstrated strong performance (accuracy: 93.8%-99.0%). Real-world evaluation revealed field extraction recall ranging from 61.8% to 87.7%, demonstrating a substantial "reality gap" with performance drops of 11-32 percentage points. The gemini-2.5-pro model achieved the highest real-world recall (87.7%). Model size did not predict performance: the 14B-parameter deepseek-r1 (77.6%) outperformed its 70B-parameter counterpart (70.4%).
Discussion: The substantial performance degradation from synthetic to real-world data underscores the complexity of authentic clinical documentation. Smaller models can achieve competitive or superior recall, reducing computational costs. With even the best models missing 12%-38% of annotated fields, mandatory human verification is essential for clinical deployment.
Conclusion: While LLM-based extraction systems show promise for pathology data extraction, synthetic validation alone provides false confidence. Rigorous real-world ground truth evaluation with expert annotation is essential before clinical deployment. These systems are best positioned as screening tools with mandatory human oversight rather than autonomous decision-making systems.
Title: Agent-based large language model system for extracting structured data from breast cancer synoptic reports: a dual-validation study. JAMIA Open. 2026;9(1):ooag016.
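The real-world recall figures come down to comparing extracted fields against expert annotation. A minimal sketch of field-level recall; the field names, values, and exact-match rule are illustrative, not the study's annotation protocol.

```python
def field_recall(annotated, extracted):
    """Recall of structured-field extraction against expert annotation.

    Both arguments map (report_id, field_name) -> value; a field counts
    as recalled only when the extracted value matches the annotation
    exactly.
    """
    hits = sum(1 for k, v in annotated.items() if extracted.get(k) == v)
    return hits / len(annotated) if annotated else 0.0

# Invented annotations for two reports (4 annotated fields total).
annotated = {("r1", "tumor_size"): "12 mm",
             ("r1", "margins"): "negative",
             ("r2", "tumor_size"): "8 mm",
             ("r2", "histologic_grade"): "2"}
extracted = {("r1", "tumor_size"): "12 mm",
             ("r1", "margins"): "negative",
             ("r2", "tumor_size"): "9 mm"}   # wrong value -> not recalled
recall = field_recall(annotated, extracted)
```

Scoring synthetic cases and real reports with the same function is what exposes the "reality gap": identical metrics, very different inputs.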
Pub Date: 2026-02-17. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooag021
Eilon Gabel, Stephen Murray, Tristan Grogan, Theodora Wingert, Ira Hofer
Background: Traditional clinical trial enrollment relies on manual screening and coordinator-led recruitment, creating scalability barriers in high-volume perioperative environments. This study evaluated whether a fully automated, electronic health record (EHR)-integrated clinical decision support (CDS) system could identify eligible patients and engage clinicians in real time without manual screening or dedicated research staff.
Methods: In this prospective implementation study, predefined respiratory-risk criteria were computed within the UCLA Perioperative Data Warehouse and transmitted to the EHR via Health Level Seven (HL7) interfaces. Patients meeting inclusion criteria automatically triggered Best Practice Advisories (BPAs) recommending an intervention. Outcomes included system accuracy in eligibility identification, provider adherence to BPA recommendations, and technical performance metrics.
Results: The automated system processed 10 592 eligible patients and achieved 51.2% provider adherence (5424 patients) to CDS prompts without coordinator involvement. BPA allocation accuracy was 69.7% among patients recovering in the post-anesthesia care unit and 59.4% when including unanticipated ICU transfers. Adherence varied significantly by care team composition, with full teams (attending + CRNA + resident) achieving 57.4% adherence compared with 42.2% for solo attendings. Workflow factors were stronger predictors of adherence than patient clinical characteristics, indicating minimal selection bias.
Conclusions: Fully automated, EHR-integrated CDS can enable large-scale, workflow-embedded enrollment into implementation-focused studies. While not a substitute for research designs requiring consent or randomization, this framework demonstrates a scalable approach for automated prescreening and CDS-driven prompting that reduces reliance on coordinator-dependent processes and supports real-world implementation science.
Title: Automating eligibility assessment and enrollment for sugammadex administration within an integrated perioperative workflow. JAMIA Open. 2026;9(1):ooag021.
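The adherence gap between full teams (57.4%) and solo attendings (42.2%) is the kind of comparison a two-proportion z-test formalizes. A minimal sketch; the abstract does not report per-group denominators, so the counts below are hypothetical and only the rates come from the study.

```python
from math import sqrt, erfc

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided two-proportion z-test with a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z, p_value

# Hypothetical denominators chosen to reproduce 57.4% vs 42.2% adherence.
z, p = two_proportion_z(2296, 4000, 633, 1500)
```

With group sizes anywhere near this scale, a 15-point adherence difference is far outside chance variation.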
Pub Date: 2026-02-15. eCollection Date: 2026-02-01. DOI: 10.1093/jamiaopen/ooaf161
Andrew D Schechtman, Samantha M R Kling, Donn W Garvert, Jessica Y Lin, Tait Shanafelt, Marcy Winget, Christopher Sharp
Objectives: Electronic health record (EHR) order preference lists and order sets potentially improve efficiency but have limited utility in complex primary care settings. We assessed adoption, impact on ordering efficiency, and clinician perceptions of a comprehensive set of nested order panels (xOrders) for adult primary care.
Methods: In Phase 1 (gradual implementation), 404 xOrders were released (November 29, 2020-September 25, 2021). At the beginning of Phase 2 (rapid implementation), 630 xOrders were released, and an additional 253 xOrders were added (September 26, 2021-June 24, 2023). Three outcomes captured adoption: xOrders used per week; number of clinician users per week; and xOrders as a percent of all orders. The impact of xOrders on time in orders per encounter per clinician was evaluated with a mixed-effects interrupted time series analysis. t-Tests evaluated differences between low, moderate, and high utilizers. A survey captured clinicians' perceptions in November 2022.
Results: xOrders were used 536 (SD, 245) times/week and by 57 (SD, 15) clinicians/week in Phase 2. xOrders as a percent of all orders ranged from 0% to 31% across clinicians. Time spent in orders per encounter decreased by 14 ± 5 s (P = .01) from Phase 1 to Phase 2 for high utilizers, decreased by 7 ± 3 s (P = .05) for moderate utilizers, and increased by 1 ± 3 s for low utilizers (P = .81); low and high utilizers differed significantly (P = .02). Most (77%) survey respondents agreed that xOrders improved ordering efficiency.
Discussion and conclusions: Despite yielding time savings and positive clinician feedback, the xOrder intervention showed limited adoption and impact, suggesting the need for expanded content and increased adoption to realize larger efficiency gains.
"Nested order panels for adult primary care modestly improves ordering efficiency among high utilizers." JAMIA Open. 2026;9(1):ooaf161. https://doi.org/10.1093/jamiaopen/ooaf161
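The mixed-effects interrupted time series analysis used in the xOrders study can be sketched on synthetic data. The model below (a random intercept per clinician, with fixed effects for the weekly trend and a Phase 2 level change) is an assumed specification for illustration, not the study's exact model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic weekly data: seconds in orders per encounter for 20 clinicians,
# with a built-in 8 s level drop at the Phase 1 -> Phase 2 boundary (week 40).
rng = np.random.default_rng(0)
rows = []
for clinician in range(20):
    base = rng.normal(120, 10)        # clinician-specific baseline (seconds)
    for week in range(80):
        phase2 = int(week >= 40)      # indicator for Phase 2
        y = base - 8 * phase2 + rng.normal(0, 4)
        rows.append({"clinician": clinician, "week": week,
                     "phase2": phase2, "seconds": y})
df = pd.DataFrame(rows)

# Mixed-effects interrupted time series: fixed effects for the secular trend
# and the Phase 2 level change, random intercept per clinician.
model = smf.mixedlm("seconds ~ week + phase2", df, groups=df["clinician"])
fit = model.fit()
print(fit.params["phase2"])   # estimated level change; true value here is -8 s
```

The `phase2` coefficient is the estimated step change at the phase boundary, which is the quantity the study reports per utilizer group.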
Pub Date: 2026-02-15; eCollection Date: 2026-02-01; DOI: 10.1093/jamiaopen/ooag011
Miranda Edmunds, Aditi Gupta, Inez Oh, Levi Kaster, Jonathan Sagel, Andrew Michelson, Yuyao Zhuge, Philip Payne, Ahmed Said
Objectives: To develop a multimodal pediatric critical care datamart supporting predictive modeling and decision support tool development, integrating high-resolution physiologic and clinical data and supporting future clinical deployment.
Materials and methods: We developed a continuously expanding datamart integrating electronic health record data, high-resolution telemetry, and extracorporeal membrane oxygenation (ECMO)-domain datasets. The platform links static and longitudinal time-series variables with expert-curated neurologic outcomes for both ECMO and non-ECMO patients, enabling trajectory-based analyses.
Results: The datamart currently includes 25 762 pediatric patients, of whom 395 received ECMO support. The datamart captures granular longitudinal physiologic, laboratory, medication, and telemetry data suitable for dynamic predictive modeling.
Discussion: Existing ECMO prognostication tools rely on static variables and lack appropriate control cohorts. This datamart enables trajectory-based multimodal modeling that reflects evolving physiology and neurologic outcomes.
Conclusion: This platform provides a scalable foundation for predictive modeling across pediatric critical care, beyond ECMO, to support precision decision-making and outcomes research.
"Multi-modal pediatric critical care datamart for extracorporeal support prediction and decision support." JAMIA Open. 2026;9(1):ooag011. https://doi.org/10.1093/jamiaopen/ooag011
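Linking high-resolution telemetry with sparser longitudinal variables, as the datamart does, can be illustrated with an as-of join: each telemetry sample is paired with the most recent lab value so models can use joint trajectory features. The column names and values below are hypothetical.

```python
import pandas as pd

# Hypothetical streams: dense telemetry and a sparse lab result.
telemetry = pd.DataFrame({
    "time": pd.to_datetime(["2026-01-01 00:00", "2026-01-01 00:05",
                            "2026-01-01 00:10"]),
    "heart_rate": [112, 118, 125],
})
labs = pd.DataFrame({
    "time": pd.to_datetime(["2026-01-01 00:02"]),
    "lactate": [3.1],
})

# merge_asof carries the latest lab value forward onto each telemetry row;
# both inputs must be sorted on the join key.
linked = pd.merge_asof(telemetry, labs, on="time")
print(linked["lactate"].tolist())   # [nan, 3.1, 3.1]
```

Rows before the first lab draw get a missing value rather than a leaked future measurement, which keeps downstream trajectory models causally valid.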
Pub Date: 2026-02-15; eCollection Date: 2026-02-01; DOI: 10.1093/jamiaopen/ooaf170
Aman Saiju, Subiksha Umakanth, Anna Vaynrub, Romi Eli, Alissa Michel, Katherine D Crew, Rita Kukafka
Objectives: This study aims to develop a detailed understanding of provider and Information Technology (IT) operations staff experiences and attitudes regarding patients' ability to edit their data. This includes understanding barriers to developing a process to write back data into the electronic health record (EHR) as well as a concrete set of recommendations on incorporating patient-generated data into the EHR.
Materials and methods: RealRisks, our team's Fast Healthcare Interoperability Resources-compliant web-based patient decision aid, was utilized as an exemplar platform in which patients can access EHR data and review, correct, and contribute patient-derived data when specific elements are missing. An interview guide was developed and semi-structured interviews of 9 participants (physicians n = 4, IT operations staff n = 5) at Columbia University Irving Medical Center were carried out to understand the feasibility of writing back patient-entered edits into the EHR using the RealRisks decision aid.
Results: Providers and IT operations staff reported varied knowledge of how patients interact with their data but collectively stated a need to increase EHR accuracy in ways that prioritize provider-patient communication. Participants supported a write-back process and had specific suggestions for implementation mechanisms (such as the option to upload test results when submitting changes).
Discussion: Providers and IT operations staff maintained that existing data management routes used for external data incorporation should be utilized, and that providers should screen edit requests to ensure EHR quality and accuracy.
Conclusion: While participants felt a write-back of patient-derived data would be helpful, future studies should directly assess the perspectives of nursing staff, advanced practice providers, and patients to ensure equity and efficacy.
"Provider and information technology operations staff perspectives on the feasibility of writing patient-generated health data into the electronic health record." JAMIA Open. 2026;9(1):ooaf170. https://doi.org/10.1093/jamiaopen/ooaf170
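Because RealRisks is FHIR-compliant, a patient-submitted correction would travel as a FHIR resource staged for provider review before any EHR write-back. The payload below is a hypothetical sketch of that shape, not the study's actual integration; the resource content and references are illustrative.

```python
import json

# Hypothetical patient-submitted correction, expressed as a FHIR Observation.
# "preliminary" status marks it as pending provider review, consistent with
# the participants' preference that clinicians screen edit requests.
observation = {
    "resourceType": "Observation",
    "status": "preliminary",
    "code": {"coding": [{"system": "http://loinc.org",
                         "code": "8302-2", "display": "Body height"}]},
    "subject": {"reference": "Patient/example"},
    "valueQuantity": {"value": 170, "unit": "cm",
                      "system": "http://unitsofmeasure.org", "code": "cm"},
}

payload = json.dumps(observation)
# A real integration would POST this to the EHR's FHIR endpoint and queue it
# for clinician approval rather than writing it to the chart directly.
print(json.loads(payload)["status"])   # preliminary
```

Keeping the resource in a review queue until a provider accepts it matches the screening step that interview participants said was essential for EHR quality.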