Anna Northrop, Anika Christofferson, Saumya Umashankar, Michelle Melisko, Paolo Castillo, Thelma Brown, Diane Heditsian, Susie Brain, Carol Simmons, Tina Hieken, Kathryn J Ruddy, Candace Mainor, Anosheh Afghahi, Sarah Tevis, Anne Blaes, Irene Kang, Adam Asare, Laura Esserman, Dawn L Hershman, Amrita Basu
Objectives: We describe the development and implementation of a system for monitoring patient-reported adverse events and quality of life using electronic Patient Reported Outcome (ePRO) instruments in the I-SPY2 Trial, a phase II clinical trial for locally advanced breast cancer. We describe the administration of technological, workflow, and behavior change interventions and their associated impact on questionnaire completion.
Materials and methods: Using the OpenClinica electronic data capture system, we developed rules-based logic to build automated ePRO surveys, customized to the I-SPY2 treatment schedule. We piloted ePROs at the University of California, San Francisco (UCSF) to optimize workflow in the context of trial treatment scenarios, and staggered the rollout of the ePRO system to 26 sites to ensure effective implementation of the technology.
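The rules-based scheduling that drives automated survey assignment can be sketched as follows; the timepoint names, day offsets, and comparison logic are illustrative assumptions, not the actual I-SPY2 treatment schedule:

```python
from datetime import date, timedelta

def due_surveys(enrollment_date, today, schedule):
    """Return the survey timepoints due by `today`.

    schedule: mapping of timepoint name -> days after enrollment.
    The timepoints and offsets here are illustrative, not the
    I-SPY2 protocol's actual schedule.
    """
    return [
        name for name, offset in schedule.items()
        if enrollment_date + timedelta(days=offset) <= today
    ]

# Hypothetical treatment-aligned timepoints.
schedule = {"screening": 0, "mid-treatment": 84, "pre-surgery": 168}
due = due_surveys(date(2024, 1, 1), date(2024, 4, 1), schedule)
```

In a production system, each due timepoint would trigger an automated survey invitation rather than a simple list entry.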
Results: Increasing ePRO completion requires workflow solutions and research staff engagement. Over two years, we increased baseline survey completion from 25% to 80%. The majority of patients completed between 30% and 75% of the questionnaires they received, with no statistically significant variation in survey completion by age, race, or ethnicity. Patients who completed the screening timepoint questionnaire were significantly more likely to complete more of the surveys they received at later timepoints (mean completion of 74.1% vs 35.5%, P < .0001). Baseline PROMIS social functioning and grade 2 or higher PRO-CTCAE interference from Abdominal Pain, Decreased Appetite, Dizziness, and Shortness of Breath were associated with lower survey completion rates.
Discussion and conclusion: By implementing ePROs, we have the potential to increase the efficiency and accuracy of patient-reported clinical trial data collection, while improving quality of care, patient safety, and health outcomes. Our method is accessible across demographics and facilitates data collection and sharing across nationwide sites. We identify predictors of decreased completion that can optimize resource allocation by better targeting efforts such as in-person outreach, staff engagement, a robust technical workflow, and increased monitoring to improve overall completion rates.
Title: Implementation and impact of an electronic patient reported outcomes system in a phase II multi-site adaptive platform clinical trial for early-stage breast cancer. Trial registration: https://clinicaltrials.gov/study/NCT01042379. Journal of the American Medical Informatics Association, published 2024-08-19. DOI: 10.1093/jamia/ocae190.
Correction to: Evaluation of crowdsourced mortality prediction models as a framework for assessing artificial intelligence in medicine. Journal of the American Medical Informatics Association, published 2024-08-16. DOI: 10.1093/jamia/ocae219.
Andrew Guide, Shawn Garbett, Xiaoke Feng, Brandy M Mapes, Justin Cook, Lina Sulieman, Robert M Cronin, Qingxia Chen
Importance: Scales often arise from multi-item questionnaires, yet commonly face item non-response. Traditional solutions use weighted mean (WMean) from available responses, but potentially overlook missing data intricacies. Advanced methods like multiple imputation (MI) address broader missing data, but demand increased computational resources. Researchers frequently use survey data in the All of Us Research Program (All of Us), and it is imperative to determine if the increased computational burden of employing MI to handle non-response is justifiable.
Objectives: Using the 5-item Physical Activity Neighborhood Environment Scale (PANES) in All of Us, this study assessed the tradeoff between efficacy and computational demands of WMean, MI, and inverse probability weighting (IPW) when dealing with item non-response.
Materials and methods: Synthetic missingness, allowing non-response on 1 or more items, was introduced into PANES across 3 missing mechanisms and various missing percentages (10%-50%). Each scenario compared WMean of completed questions, MI, and IPW on bias, variability, coverage probability, and computation time.
Results: All methods showed minimal biases (all <5.5%) when internal consistency was good, with WMean suffering the most under poor consistency. IPW showed considerable variability as the missing percentage increased. MI required significantly more computational resources, taking >8000 and >100 times longer than WMean and IPW, respectively, in the full data analysis.
Discussion and conclusion: The marginal performance advantages of MI for item non-response in highly reliable scales do not warrant its escalated cloud computational burden in All of Us, particularly when coupled with computationally demanding post-imputation analyses. Researchers using survey scales with low missingness could utilize WMean to reduce computing burden.
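The WMean approach scores a scale from whichever items were answered. A minimal sketch of that idea; the equal item weights, integer response coding, and missingness cutoff are assumptions for illustration, not the published PANES scoring rules:

```python
def wmean_score(responses, weights=None, max_missing=2):
    """Score a multi-item scale from available responses only.

    responses: list of item answers, with None marking item non-response.
    weights: optional per-item weights (equal by default).
    Returns None when more than max_missing items are unanswered,
    since too much missingness makes the score unreliable.
    """
    responses = list(responses)
    if weights is None:
        weights = [1.0] * len(responses)
    answered = [(r, w) for r, w in zip(responses, weights) if r is not None]
    if len(responses) - len(answered) > max_missing:
        return None
    total_w = sum(w for _, w in answered)
    return sum(r * w for r, w in answered) / total_w
```

With equal weights this reduces to the mean of the answered items, which is why it carries almost no computational cost compared with multiple imputation.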
Title: Balancing efficacy and computational burden: weighted mean, multiple imputation, and inverse probability weighting methods for item non-response in reliable scales. Journal of the American Medical Informatics Association, published 2024-08-13. DOI: 10.1093/jamia/ocae217.
Izabelle Humes, Cathy Shyr, Moira Dillon, Zhongjie Liu, Jennifer Peterson, Chris St Jeor, Jacqueline Malkes, Hiral Master, Brandy Mapes, Romuladus Azuine, Nakia Mack, Bassent Abdelbary, Joyonna Gamble-George, Emily Goldmann, Stephanie Cook, Fatemeh Choupani, Rubin Baskir, Sydney McMaster, Chris Lunt, Karriem Watson, Minnkyong Lee, Sophie Schwartz, Ruchi Munshi, David Glazer, Eric Banks, Anthony Philippakis, Melissa Basford, Dan Roden, Paul A Harris
Objectives: The All of Us Research Program is a precision medicine initiative aimed at establishing a vast, diverse biomedical database accessible through a cloud-based data analysis platform, the Researcher Workbench (RW). Our goal was to empower the research community by co-designing the implementation of SAS in the RW alongside researchers to enable broader use of All of Us data.
Materials and methods: Researchers from various fields and with different SAS experience levels participated in co-designing the SAS implementation through user experience interviews.
Results: Feedback and lessons learned from user testing informed the final design of the SAS application.
Discussion: The co-design approach is critical for reducing technical barriers, broadening All of Us data use, and enhancing the user experience for data analysis on the RW.
Conclusion: Our co-design approach successfully tailored the implementation of the SAS application to researchers' needs. This approach may inform future software implementations on the RW.
Title: Empowering the biomedical research community: Innovative SAS deployment on the All of Us Researcher Workbench. Journal of the American Medical Informatics Association, published 2024-08-12. DOI: 10.1093/jamia/ocae216.
Objective: The Australian Cancer Atlas (ACA) aims to provide small-area estimates of cancer incidence and survival in Australia to help identify and address geographical health disparities. We report on a 21-month user-centered design study to visualize the data, in particular the visualization of estimate uncertainty for multiple audiences.
Materials and methods: The preliminary phases included a scoping study, literature review, and target audience focus groups. Several methods were used to reach the wide target audience. The design and development stage included digital prototyping in parallel with Bayesian model development. Feedback was sought from multiple workshops, audience focus groups, and regular meetings throughout with an expert external advisory group.
Results: The initial scoping identified 4 target audience groups: the general public, researchers, health practitioners, and policy makers. These target groups were consulted throughout the project to ensure the developed model and uncertainty visualizations were effective for communication. In this paper, we detail ACA features and design iterations, including the 3 complementary ways in which uncertainty is communicated: the wave plot, the v-plot, and color transparency.
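Color transparency, the third encoding, maps higher estimate uncertainty to a more faded color. A small sketch of that idea in Python; the linear alpha scale and base color are assumptions, not the ACA's published mapping:

```python
def uncertainty_to_rgba(estimate_sds, base_rgb=(0.2, 0.4, 0.8)):
    """Map per-area uncertainty to color transparency.

    estimate_sds: one standard deviation per small area.
    More uncertain estimates get lower alpha (fade toward the
    background), echoing the ACA's transparency encoding; the
    Atlas's exact alpha scale is not given in this abstract.
    """
    lo, hi = min(estimate_sds), max(estimate_sds)
    span = (hi - lo) or 1.0  # avoid division by zero when all equal
    return [base_rgb + (1.0 - (sd - lo) / span,) for sd in estimate_sds]

# Toy standard deviations for four areas.
colors = uncertainty_to_rgba([0.05, 0.30, 0.10, 0.45])
```

Each RGBA tuple can be passed directly to a plotting library, so the most reliable estimates render fully opaque and the least reliable nearly vanish.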
Discussion: We reflect on the methods, design iterations, decision-making process, and document lessons learned for future atlases.
Conclusion: The ACA has been hugely successful since launching in 2018. It has attracted over 62 000 individual users from over 100 countries and across all target audiences. It has been replicated in other countries, and the second version of the ACA was launched in May 2024. This paper provides rich documentation for future projects.
Authors: Sarah Goodwin, Thom Saunders, Joanne Aitken, Peter Baade, Upeksha Chandrasiri, Dianne Cook, Susanna Cramb, Earl Duncan, Stephanie Kobakian, Jessie Roberts, Kerrie Mengersen.
Title: Designing the Australian Cancer Atlas: visualizing geostatistical model uncertainty for multiple audiences. Journal of the American Medical Informatics Association, published 2024-08-12. DOI: 10.1093/jamia/ocae212.
Siwei Zhang, Nick Strayer, Tess Vessels, Karmel Choi, Geoffrey W Wang, Yajing Li, Cosmin A Bejan, Ryan S Hsi, Alexander G Bick, Digna R Velez Edwards, Michael R Savona, Elizabeth J Phillips, Jill M Pulley, Wesley H Self, Wilkins Consuelo Hopkins, Dan M Roden, Jordan W Smoller, Douglas M Ruderfer, Yaomin Xu
Objectives: To address the need for interactive visualization tools and databases in characterizing multimorbidity patterns across different populations, we developed the Phenome-wide Multi-Institutional Multimorbidity Explorer (PheMIME). This tool leverages three large-scale EHR systems to facilitate efficient analysis and visualization of disease multimorbidity, aiming to reveal both robust and novel disease associations that are consistent across different systems and to provide insight for enhancing personalized healthcare strategies.
Materials and methods: PheMIME integrates summary statistics from phenome-wide analyses of disease multimorbidities, utilizing data from Vanderbilt University Medical Center, Mass General Brigham, and the UK Biobank. It offers interactive and multifaceted visualizations for exploring multimorbidity. Incorporating an enhanced version of associationSubgraphs, PheMIME also enables dynamic analysis and inference of disease clusters, promoting the discovery of complex multimorbidity patterns. A case study on schizophrenia demonstrates its capability for generating interactive visualizations of multimorbidity networks within and across multiple systems. Additionally, PheMIME supports diverse multimorbidity-based discoveries, detailed further in online case studies.
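The raw input to such phenome-wide multimorbidity statistics is pairwise disease co-occurrence across patients, which can be sketched as follows; the toy records and phecode-style labels are illustrative, and PheMIME's actual association measures are not specified in this abstract:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(patient_phenotypes):
    """Count how often each phenotype pair co-occurs across patients.

    patient_phenotypes: dict of patient id -> set of phenotype codes.
    Returns a Counter keyed by sorted (code_a, code_b) pairs; counts
    like these feed downstream association statistics.
    """
    pair_counts = Counter()
    for phenos in patient_phenotypes.values():
        for a, b in combinations(sorted(phenos), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

patients = {  # toy records, not real EHR data
    "p1": {"295.1", "296.2"},
    "p2": {"295.1", "296.2", "250.2"},
    "p3": {"250.2"},
}
counts = cooccurrence_counts(patients)
```

Summary statistics of this kind, computed separately per institution, can then be compared across EHR systems without sharing patient-level data.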
Results: PheMIME is accessible at https://prod.tbilab.org/PheMIME/. A comprehensive tutorial and multiple case studies for demonstration are available at https://prod.tbilab.org/PheMIME_supplementary_materials/. The source code can be downloaded from https://github.com/tbilab/PheMIME.
Discussion: PheMIME represents a significant advancement in medical informatics, offering an efficient solution for accessing, analyzing, and interpreting the complex and noisy real-world patient data in electronic health records.
Conclusion: PheMIME provides an extensive multimorbidity knowledge base that consolidates data from three EHR systems, and it is a novel interactive tool designed to analyze and visualize multimorbidities across multiple EHR datasets. It stands out as the first of its kind to offer extensive multimorbidity knowledge integration with substantial support for efficient online analysis and interactive visualization.
Title: PheMIME: an interactive web app and knowledge base for phenome-wide, multi-institutional multimorbidity analysis. Journal of the American Medical Informatics Association, published 2024-08-10. DOI: 10.1093/jamia/ocae182.
Yuting Guo, Anthony Ovadje, Mohammed Ali Al-Garadi, Abeed Sarker
Objectives: Large language models (LLMs) have demonstrated remarkable success in natural language processing (NLP) tasks. This study aimed to evaluate their performances on social media-based health-related text classification tasks.
Materials and methods: We benchmarked 1 Support Vector Machine (SVM), 3 supervised pretrained language models (PLMs), and 2 LLM-based classifiers across 6 text classification tasks. We developed 3 approaches for leveraging LLMs: employing LLMs as zero-shot classifiers, using LLMs as data annotators, and utilizing LLMs with few-shot examples for data augmentation.
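The zero-shot approach amounts to prompting the LLM with a post and a fixed label set, then mapping its free-text reply back onto a label. A sketch with hypothetical task labels and prompt wording; the study's actual prompts and label sets are not given in the abstract:

```python
def build_zero_shot_prompt(text, labels):
    """Format a health-text classification request for an
    instruction-tuned LLM. The wording is illustrative only."""
    options = ", ".join(labels)
    return (
        "Classify the following social media post into exactly one of "
        f"these categories: {options}.\n"
        f"Post: {text}\n"
        "Answer with the category name only."
    )

def parse_label(reply, labels, default):
    """Map a free-text LLM reply back onto the fixed label set,
    falling back to a default when nothing matches."""
    reply = reply.strip().lower()
    for label in labels:
        if label.lower() in reply:
            return label
    return default

labels = ["medication abuse", "non-abuse"]  # hypothetical task labels
prompt = build_zero_shot_prompt("took double my dose again", labels)
```

The same prompt/parse pair also supports the annotator approach: replies over unlabeled posts become training labels for a smaller supervised model.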
Results: Across all tasks, the mean (SD) F1 score differences for RoBERTa, BERTweet, and SocBERT trained on human-annotated data were 0.24 (±0.10), 0.25 (±0.11), and 0.23 (±0.11), respectively, compared to those trained on the data annotated using GPT3.5, and were 0.16 (±0.07), 0.16 (±0.08), and 0.14 (±0.08) using GPT4, respectively. The GPT3.5 and GPT4 zero-shot classifiers outperformed SVMs in a single task and in 5 out of 6 tasks, respectively. When leveraging LLMs for data augmentation, the RoBERTa models trained on GPT4-augmented data demonstrated superior or comparable performance compared to those trained on human-annotated data alone.
Discussion: The results revealed that using LLM-annotated data only for training supervised classification models was ineffective. However, employing the LLM as a zero-shot classifier exhibited the potential to outperform traditional SVM models and achieved a higher recall than the advanced transformer-based model RoBERTa. Additionally, our results indicated that utilizing GPT3.5 for data augmentation could potentially harm model performance. In contrast, data augmentation with GPT4 demonstrated improved model performances, showcasing the potential of LLMs in reducing the need for extensive training data.
Conclusions: By leveraging the data augmentation strategy, we can harness the power of LLMs to develop smaller, more effective domain-specific NLP models. Using LLM-annotated data without human guidance for training lightweight supervised classification models is an ineffective strategy. However, LLM, as a zero-shot classifier, shows promise in excluding false negatives and potentially reducing the human effort required for data annotation.
Title: Evaluating large language models for health-related text classification tasks with public social media data. Journal of the American Medical Informatics Association, published 2024-08-09. DOI: 10.1093/jamia/ocae210.
Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran
Objectives: To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to function as a node in a federated data network that uses SHRINE and to dynamically generate queries for heterogeneous data models.
Materials and methods: SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.
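The core of the translation step — converting query concepts into equivalent SQL for a different data model — could look roughly like this sketch. The concept paths, target-model column names, and table layout are hypothetical; Leaf's actual translation logic is considerably more involved.

```python
# Illustrative sketch of concept-to-SQL translation across data models.
# Concept paths and target columns are invented for this example.

# Hypothetical map from i2b2-style concept paths to predicates in another model
CONCEPT_MAP = {
    r"\ACT\Diagnosis\ICD10\E11": "condition_code LIKE 'E11%'",  # type 2 diabetes
    r"\ACT\Medications\Metformin": "drug_name = 'metformin'",
}

def translate_to_count_sql(concept_paths: list, table: str = "observation") -> str:
    """Turn a list of query concepts into a patient-count SQL statement."""
    predicates = []
    for path in concept_paths:
        if path not in CONCEPT_MAP:
            raise KeyError(f"no mapping for concept {path!r}")
        predicates.append(f"({CONCEPT_MAP[path]})")
    where = " OR ".join(predicates)
    return f"SELECT COUNT(DISTINCT patient_id) FROM {table} WHERE {where}"

sql = translate_to_count_sql([r"\ACT\Diagnosis\ICD10\E11"])
# SELECT COUNT(DISTINCT patient_id) FROM observation
#   WHERE (condition_code LIKE 'E11%')
```

Returning patient counts rather than row counts mirrors how federated cohort discovery networks report results, and raising on an unmapped concept makes translation gaps (a major error source in the evaluation) fail loudly rather than silently undercount.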
Results and discussion: Overall, 91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was subsequently fixed.
Conclusion: Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.
Towards cross-application model-agnostic federated cohort discovery. Journal of the American Medical Informatics Association, published 2024-08-07, doi:10.1093/jamia/ocae211.
Carla McGruder, Kelly Tangney, Deanna Erwin, Jake Plewa, Karyn Onyeneho, Rhonda Moore, Anastasia Wise, Scott Topper, Alicia Y Zhou
Objective: This article outlines a scalable system developed by the All of Us Research Program's Genetic Counseling Resource to vet a large database of healthcare resources for supporting participants with health-related DNA results.
Materials and methods: After a literature review of established evaluation frameworks for health resources, we created SONAR, a 10-item framework and grading scale for health-related participant-facing resources. SONAR was used to review clinical resources that could be shared with participants during genetic counseling.
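A checklist-style framework like SONAR can be operationalized as a simple pass/fail scoring function. The 10 item names below and the approval threshold are invented for illustration — the actual SONAR criteria and grading scale are defined in the paper.

```python
# Minimal sketch of a 10-item grading scale in the style of SONAR.
# Item names and the pass threshold are hypothetical placeholders.

SONAR_ITEMS = [
    "credible_source", "current_content", "accurate_content", "clear_language",
    "accessible_format", "relevant_scope", "no_commercial_bias",
    "privacy_respected", "contact_info_listed", "actionable_guidance",
]

def grade_resource(ratings: dict, pass_threshold: int = 8) -> str:
    """Score a resource: one point per satisfied item; approve at threshold."""
    missing = set(SONAR_ITEMS) - set(ratings)
    if missing:
        raise ValueError(f"unrated items: {sorted(missing)}")
    score = sum(bool(ratings[item]) for item in SONAR_ITEMS)
    return "approved" if score >= pass_threshold else "rejected"
```

Encoding the criteria this way is what makes the vetting repeatable across reviewers: every resource gets the same 10 questions, and approval is a deterministic function of the answers rather than an ad hoc judgment.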
Results: Application of SONAR shortened resource approval time from 7 days to 1 day. In total, 256 resources were approved and 8 rejected through SONAR review. Most approved resources were relevant to participants nationwide (60.0%). The most common resource types were related to cancer care (30.6%), support groups (20%), and general educational resources (12.4%). All of Us genetic counselors provided approved resources in 1161 of 3005 consults (38.6%), most often referrals to local genetic counselors (29.9%), support groups (21.9%), and educational resources (21.0%).
Discussion: SONAR's systematic method simplifies resource vetting for healthcare providers, easing the burden of identifying and evaluating credible resources. Compiling these resources into a user-friendly database allows providers to share them efficiently, better equipping participants to complete follow-up actions from health-related DNA results.
Conclusion: The All of Us Genetic Counseling Resource connects participants receiving health-related DNA results with relevant follow-up resources on a high-volume, national level. This has been made possible by the creation of a novel resource database and validation system.
Sounding out solutions: using SONAR to connect participants with relevant healthcare resources. Journal of the American Medical Informatics Association, published 2024-08-02, doi:10.1093/jamia/ocae200.
David Cella, Maja Kuharic, John Devin Peipert, Katy Bedjeti, Sofia F Garcia, Betina Yanez, Lisa R Hirschhorn, Ava Coughlin, Victoria Morken, Mary O'Connor, Jeffrey A Linder, Neil Jordan, Ronald T Ackermann, Saki Amagai, Sheetal Kircher, Nisha Mohindra, Vikram Aggarwal, Melissa Weitzel, Eugene C Nelson, Glyn Elwyn, Aricca D Van Citters, Cynthia Barnard
Objectives: To assess the use of a co-designed patient-reported outcome (PRO) clinical dashboard and estimate its impact on shared decision-making (SDM) and symptomatology in adults with advanced cancer or chronic kidney disease (CKD).
Materials and methods: We developed a clinical PRO dashboard within the Northwestern Medicine Patient-Reported Outcomes system, enhanced through co-design involving 20 diverse constituents. Using a single-group, pretest-posttest design, we evaluated the dashboard's use among patients with advanced cancer or CKD between June 2020 and January 2022. Eligible patients had a visit with a participating clinician, completed at least two dashboard-eligible visits, and consented to follow-up surveys. PROs were collected 72 h prior to visits, including measures for chronic condition management self-efficacy, health-related quality of life (PROMIS measures), and SDM (collaboRATE). Responses were integrated into the EHR dashboard and accessible to clinicians and patients.
Results: We recruited 157 participants: 66 with advanced cancer and 91 with CKD. There were significant improvements in SDM from baseline, as assessed by collaboRATE scores. The proportion of participants reporting the highest level of SDM on every collaboRATE item increased by 15 percentage points from baseline to 3 months, and by 17 points between baseline and 6-month follow-up. Additionally, there was a clinically meaningful decrease in anxiety levels over the study period (T-score baseline: 53; 3-month: 52; 6-month: 50; P < .001), with a standardized response mean (SRM) of -0.38 at 6 months.
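The "top score" metric used here — the share of respondents who give the maximum rating on every collaboRATE item — is straightforward to compute. The item keys and the assumption of a 0-9 response scale below are illustrative, not taken from the study's data dictionary.

```python
# Sketch of the collaboRATE "top score" metric: fraction of respondents
# rating every item at the scale maximum. Item names and the 0-9 scale
# are assumptions for illustration.

def top_score_proportion(responses: list, max_rating: int = 9) -> float:
    """Fraction of respondents who rate every item at the maximum."""
    if not responses:
        return 0.0
    top = sum(all(v == max_rating for v in r.values()) for r in responses)
    return top / len(responses)

baseline = [
    {"q1": 9, "q2": 9, "q3": 9},  # top score on all items
    {"q1": 9, "q2": 8, "q3": 9},  # one item below the maximum
]
share = top_score_proportion(baseline)  # 0.5
```

An all-or-nothing criterion like this is deliberately strict: it avoids the ceiling effects that mean scores suffer on highly skewed patient-experience measures, which is why the reported effect is expressed in percentage-point changes of this proportion.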
Discussion: PRO clinical dashboards, developed and shared with patients, may enhance SDM and reduce anxiety among patients with advanced cancer and CKD.
Shared decision-making and disease management in advanced cancer and chronic kidney disease using patient-reported outcome dashboards. Journal of the American Medical Informatics Association, published 2024-08-02, doi:10.1093/jamia/ocae180.