{"title":"Large language models in biomedicine and health: current research landscape and future directions.","authors":"Zhiyong Lu, Yifan Peng, Trevor Cohen, Marzyeh Ghassemi, Chunhua Weng, Shubo Tian","doi":"10.1093/jamia/ocae202","DOIUrl":"10.1093/jamia/ocae202","url":null,"abstract":"","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":"31 9","pages":"1801-1811"},"PeriodicalIF":4.7,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339542/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142019383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chao Yan, Henry H Ong, Monika E Grabowska, Matthew S Krantz, Wu-Chen Su, Alyson L Dickson, Josh F Peterson, QiPing Feng, Dan M Roden, C Michael Stein, V Eric Kerchberger, Bradley A Malin, Wei-Qi Wei
Objectives: Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts.
Materials and methods: We prompted four LLMs (GPT-4 and GPT-3.5 via ChatGPT, Claude 2, and Bard) in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (ie, type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network.
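As a rough illustration of this workflow (not the study's prompts or the LLM-generated algorithms themselves), the sketch below pairs a prompt of the kind described with a hand-written SQL draft of the sort the experts rated; the OMOP-style table names, the concept ID, and the two-diagnosis rule are assumptions to be verified rather than published criteria.

```python
# Illustrative sketch only: a phenotyping prompt and an example SQL draft of the
# kind described above. Table names follow the OMOP CDM; the concept ID and the
# two-diagnosis rule are assumptions, not the published eMERGE or LLM algorithms.

PHENOTYPE = "type 2 diabetes mellitus"

prompt = (
    "Write an executable SQL query against the OMOP Common Data Model that "
    f"returns the person_id of every patient meeting a definition of {PHENOTYPE}. "
    "Use standard concept IDs and state the logic behind each criterion."
)

example_sql_draft = """
SELECT co.person_id
FROM condition_occurrence AS co
JOIN concept_ancestor AS ca
  ON co.condition_concept_id = ca.descendant_concept_id
WHERE ca.ancestor_concept_id = 201826   -- 'Type 2 diabetes mellitus' (verify in the vocabulary)
GROUP BY co.person_id
HAVING COUNT(*) >= 2;                   -- require at least two qualifying diagnoses
"""

print(prompt)
print(example_sql_draft)
```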
Results: GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability compared with Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they showed a limited ability to organize phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values).
Conclusion: GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.
{"title":"Large language models facilitate the generation of electronic health record phenotyping algorithms.","authors":"Chao Yan, Henry H Ong, Monika E Grabowska, Matthew S Krantz, Wu-Chen Su, Alyson L Dickson, Josh F Peterson, QiPing Feng, Dan M Roden, C Michael Stein, V Eric Kerchberger, Bradley A Malin, Wei-Qi Wei","doi":"10.1093/jamia/ocae072","DOIUrl":"10.1093/jamia/ocae072","url":null,"abstract":"<p><strong>Objectives: </strong>Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts.</p><p><strong>Materials and methods: </strong>We prompted four LLMs-GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard-in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (ie, type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network.</p><p><strong>Results: </strong>GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values).</p><p><strong>Conclusion: </strong>GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"1994-2001"},"PeriodicalIF":4.7,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339509/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140873366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Theresa A Koleck, Caitlin Dreisbach, Chen Zhang, Susan Grayson, Maichou Lor, Zhirui Deng, Alex Conway, Peter D R Higgins, Suzanne Bakken
Objectives: Integration of social determinants of health into health outcomes research will allow researchers to study health inequities. The All of Us Research Program has the potential to be a rich source of social determinants of health data. However, user-friendly recommendations for scoring and interpreting the All of Us Social Determinants of Health Survey are needed to return value to communities through advancing researcher competencies in use of the All of Us Research Hub Researcher Workbench. We created a user guide aimed at providing researchers with an overview of the Social Determinants of Health Survey, recommendations for scoring and interpreting participant responses, and readily executable R and Python functions.
Target audience: This user guide targets registered users of the All of Us Research Hub Researcher Workbench, a cloud-based platform that supports analysis of All of Us data, who are currently conducting or planning to conduct analyses using the Social Determinants of Health Survey.
Scope: We introduce 14 constructs evaluated as part of the Social Determinants of Health Survey and summarize construct operationalization. We offer 30 literature-informed recommendations for scoring participant responses and interpreting scores, with multiple options available for 8 of the constructs. Then, we walk through example R and Python functions for relabeling responses and scoring constructs that can be directly implemented in Jupyter Notebook or RStudio within the Researcher Workbench. Full source code is available in the supplemental files and on GitHub. Finally, we discuss psychometric considerations related to the Social Determinants of Health Survey for researchers.
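The guide's actual functions are not reproduced here; as a minimal sketch of the two steps it covers (relabeling raw responses, then scoring a construct), the Python below assumes a hypothetical construct, hypothetical column names, and a simple mean-of-available-items score, whereas the guide's recommendations are construct-specific.

```python
import pandas as pd

# Minimal sketch (not the guide's functions): relabel text responses to numeric
# codes, then score a construct. Item column names and the response map are
# hypothetical; the guide gives construct-specific scoring recommendations.

NEIGHBORHOOD_COHESION_ITEMS = ["cohesion_1", "cohesion_2", "cohesion_3", "cohesion_4"]

RESPONSE_MAP = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral (neither agree nor disagree)": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

def relabel_responses(df: pd.DataFrame, items: list[str]) -> pd.DataFrame:
    """Convert text responses to numeric codes; unmapped values become NaN."""
    out = df.copy()
    out[items] = out[items].apply(lambda col: col.map(RESPONSE_MAP))
    return out

def score_construct(df: pd.DataFrame, items: list[str]) -> pd.Series:
    """Score as the mean of answered items (one common convention only)."""
    return df[items].mean(axis=1, skipna=True)

# Example usage inside the Researcher Workbench (Jupyter):
# survey = relabel_responses(survey, NEIGHBORHOOD_COHESION_ITEMS)
# survey["cohesion_score"] = score_construct(survey, NEIGHBORHOOD_COHESION_ITEMS)
```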
{"title":"User guide for Social Determinants of Health Survey data in the All of Us Research Program.","authors":"Theresa A Koleck, Caitlin Dreisbach, Chen Zhang, Susan Grayson, Maichou Lor, Zhirui Deng, Alex Conway, Peter D R Higgins, Suzanne Bakken","doi":"10.1093/jamia/ocae214","DOIUrl":"10.1093/jamia/ocae214","url":null,"abstract":"<p><strong>Objectives: </strong>Integration of social determinants of health into health outcomes research will allow researchers to study health inequities. The All of Us Research Program has the potential to be a rich source of social determinants of health data. However, user-friendly recommendations for scoring and interpreting the All of Us Social Determinants of Health Survey are needed to return value to communities through advancing researcher competencies in use of the All of Us Research Hub Researcher Workbench. We created a user guide aimed at providing researchers with an overview of the Social Determinants of Health Survey, recommendations for scoring and interpreting participant responses, and readily executable R and Python functions.</p><p><strong>Target audience: </strong>This user guide targets registered users of the All of Us Research Hub Researcher Workbench, a cloud-based platform that supports analysis of All of Us data, who are currently conducting or planning to conduct analyses using the Social Determinants of Health Survey.</p><p><strong>Scope: </strong>We introduce 14 constructs evaluated as part of the Social Determinants of Health Survey and summarize construct operationalization. We offer 30 literature-informed recommendations for scoring participant responses and interpreting scores, with multiple options available for 8 of the constructs. Then, we walk through example R and Python functions for relabeling responses and scoring constructs that can be directly implemented in Jupyter Notebook or RStudio within the Researcher Workbench. Full source code is available in supplemental files and GitHub. Finally, we discuss psychometric considerations related to the Social Determinants of Health Survey for researchers.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142082352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mona Alshahawey, Eissa Jafari, Steven M Smith, Caitrin W McDonough
Background: Hypertension (HTN) remains a significant public health concern and the primary modifiable risk factor for cardiovascular disease, which is the leading cause of death in the United States. We applied our validated HTN computable phenotypes within the All of Us Research Program to uncover the prevalence and characteristics of HTN and apparent treatment-resistant hypertension (aTRH) in the United States.
Methods: Within the All of Us Researcher Workbench, we built a retrospective cohort (January 1, 2008-July 1, 2023), identifying all adults with available age data, at least one blood pressure (BP) measurement, at least one prescribed antihypertensive medication, and at least one SNOMED "Essential hypertension" diagnosis code.
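A minimal sketch of this inclusion step, assuming per-person summary counts have already been extracted from the Researcher Workbench; the dataframe and column names are hypothetical, and this is not the authors' validated phenotyping code.

```python
import pandas as pd

# Hypothetical per-person summary table; not the authors' code.
def build_htn_cohort(persons: pd.DataFrame) -> pd.DataFrame:
    """Keep adults with age data, >=1 BP measurement, >=1 antihypertensive
    prescription, and >=1 SNOMED 'Essential hypertension' diagnosis."""
    return persons[
        persons["age"].notna()
        & (persons["age"] >= 18)
        & (persons["n_bp_measurements"] >= 1)
        & (persons["n_antihypertensive_rx"] >= 1)
        & (persons["n_essential_htn_dx"] >= 1)
    ]
```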
Results: We identified 99 461 participants with HTN who met the eligibility criteria. After applying our computable phenotypes, an overall population of 81 462 was further categorized into aTRH (14.4%), stable-controlled HTN (SCH) (39.5%), and Other HTN (46.1%). Compared with participants with SCH, participants with aTRH were older, more likely to be of Black or African American race, and had higher levels of social deprivation and a higher prevalence of comorbidities such as hyperlipidemia and diabetes. Heart failure, chronic kidney disease, and diabetes were the comorbidities most strongly associated with aTRH. β-blockers were the most prescribed antihypertensive medication. At the index date, the overall BP control rate was 62%.
Discussion and conclusion: All of Us provides a unique opportunity to characterize HTN in the United States. The consistency of these findings with our prior research highlights the interoperability of our computable phenotypes.
{"title":"Characterizing apparent treatment resistant hypertension in the United States: insights from the All of Us Research Program.","authors":"Mona Alshahawey, Eissa Jafari, Steven M Smith, Caitrin W McDonough","doi":"10.1093/jamia/ocae227","DOIUrl":"https://doi.org/10.1093/jamia/ocae227","url":null,"abstract":"<p><strong>Background: </strong>Hypertension (HTN) remains a significant public health concern and the primary modifiable risk factor for cardiovascular disease, which is the leading cause of death in the United States. We applied our validated HTN computable phenotypes within the All of Us Research Program to uncover prevalence and characteristics of HTN and apparent treatment-resistant hypertension (aTRH) in United States.</p><p><strong>Methods: </strong>Within the All of Us Researcher Workbench, we built a retrospective cohort (January 1, 2008-July 1, 2023), identifying all adults with available age data, at least one blood pressure (BP) measurement, prescribed at least one antihypertensive medication, and with at least one SNOMED \"Essential hypertension\" diagnosis code.</p><p><strong>Results: </strong>We identified 99 461 participants with HTN who met the eligibility criteria. Following the application of our computable phenotypes, an overall population of 81 462 were further categorized to aTRH (14.4%), stable-controlled HTN (SCH) (39.5%), and Other HTN (46.1%). Compared to participants with SCH, participants with aTRH were older, more likely to be of Black or African American race, had higher levels of social deprivation, and a heightened prevalence of comorbidities such as hyperlipidemia and diabetes. Heart failure, chronic kidney disease, and diabetes were the comorbidities most strongly associated with aTRH. β-blockers were the most prescribed antihypertensive medication. At index date, the overall BP control rate was 62%.</p><p><strong>Discussion and conclusion: </strong>All of Us provides a unique opportunity to characterize HTN in the United States. Consistent findings from this study with our prior research highlight the interoperability of our computable phenotypes.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142057034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Objectives: Research participants value learning how their data contributions are advancing health research (ie, data stories). The All of Us Research Program gathered insights from program staff to learn what research topics they think are of interest to participants, what support staff need to communicate data stories, and how staff use data story dissemination tools.
Materials and methods: Using an online 25-item assessment, we collected information from All of Us staff at 7 Federally Qualified Health Centers.
Results: Topics of greatest interest or relevance included income insecurity (83%), diabetes (78%), and mental health (78%). Respondents prioritized in-person outreach in the community (70%) as a preferred setting to share data stories. Familiarity with available dissemination tools varied.
Discussion: Responses support prioritizing materials for in-person outreach and training staff in how to use dissemination tools.
Conclusion: The findings will inform All of Us communication strategy, content, materials, and staff training resources to effectively deliver data stories as return of value to participants.
{"title":"Communicating research findings as a return of value to All of Us Research Program participants: insights from staff at Federally Qualified Health Centers.","authors":"Kathryn P Smith, Jenn Holmes, Jennifer Shelley","doi":"10.1093/jamia/ocae207","DOIUrl":"https://doi.org/10.1093/jamia/ocae207","url":null,"abstract":"<p><strong>Objectives: </strong>Research participants value learning how their data contributions are advancing health research (ie, data stories). The All of Us Research Program gathered insights from program staff to learn what research topics they think are of interest to participants, what support staff need to communicate data stories, and how staff use data story dissemination tools.</p><p><strong>Materials and methods: </strong>Using an online 25-item assessment, we collected information from All of Us staff at 7 Federally Qualified Health Centers.</p><p><strong>Results: </strong>Topics of greatest interest or relevance included income insecurity (83%), diabetes (78%), and mental health (78%). Respondents prioritized in-person outreach in the community (70%) as a preferred setting to share data stories. Familiarity with available dissemination tools varied.</p><p><strong>Discussion: </strong>Responses support prioritizing materials for in-person outreach and training staff how to use dissemination tools.</p><p><strong>Conclusion: </strong>The findings will inform All of Us communication strategy, content, materials, and staff training resources to effectively deliver data stories as return of value to participants.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142019381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sukanya Mohapatra, Mirna Issa, Vedrana Ivezic, Rose Doherty, Stephanie Marks, Esther Lan, Shawn Chen, Keith Rozett, Lauren Cullen, Wren Reynolds, Rose Rocchio, Gregg C Fonarow, Michael K Ong, William F Speier, Corey W Arnold
Objectives: Mobile health (mHealth) regimens can improve health through the continuous monitoring of biometric parameters paired with appropriate interventions. However, adherence to monitoring tends to decay over time. Our randomized controlled trial sought to determine: (1) if a mobile app with gamification and financial incentives significantly increases adherence to mHealth monitoring in a population of heart failure patients; and (2) if activity data correlate with disease-specific symptoms.
Materials and methods: We recruited individuals with heart failure into a prospective 180-day monitoring study with 3 arms. All 3 arms included monitoring with a connected weight scale and an activity tracker. The second arm included an additional mobile app with gamification, and the third arm included the mobile app and a financial incentive awarded based on adherence to mobile monitoring.
Results: We recruited 111 heart failure patients into the study. We found that the arm including the financial incentive led to significantly higher adherence to activity tracker (95% vs 72.2%, P = .01) and weight (87.5% vs 69.4%, P = .002) monitoring compared to the arm that included the monitoring devices alone. Furthermore, we found a significant correlation between daily steps and daily symptom severity.
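As a sketch of how such measures could be derived from daily monitoring data (the trial's actual analysis code and statistical tests are not specified in the abstract), assuming a long-format dataframe with hypothetical column names:

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical long-format daily data: one row per patient-day with an indicator
# of whether the activity tracker synced and, when available, step count and
# self-reported symptom severity. Not the trial's code.

def adherence_rate(daily: pd.DataFrame) -> float:
    """Proportion of monitored days on which tracker data were received."""
    return daily["tracker_synced"].mean()

def steps_symptom_correlation(daily: pd.DataFrame) -> tuple[float, float]:
    """Rank correlation between daily steps and daily symptom severity
    (Spearman chosen here for illustration; the study's test may differ)."""
    complete = daily.dropna(subset=["steps", "symptom_severity"])
    rho, p_value = spearmanr(complete["steps"], complete["symptom_severity"])
    return rho, p_value
```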
Discussion and conclusion: Our findings indicate that mobile apps with added engagement features can be useful tools for improving adherence over time and may thus increase the impact of mHealth-driven interventions. Additionally, activity tracker data can provide passive monitoring of disease burden that may be used to predict future events.
{"title":"Increasing adherence and collecting symptom-specific biometric signals in remote monitoring of heart failure patients: a randomized controlled trial.","authors":"Sukanya Mohapatra, Mirna Issa, Vedrana Ivezic, Rose Doherty, Stephanie Marks, Esther Lan, Shawn Chen, Keith Rozett, Lauren Cullen, Wren Reynolds, Rose Rocchio, Gregg C Fonarow, Michael K Ong, William F Speier, Corey W Arnold","doi":"10.1093/jamia/ocae221","DOIUrl":"https://doi.org/10.1093/jamia/ocae221","url":null,"abstract":"<p><strong>Objectives: </strong>Mobile health (mHealth) regimens can improve health through the continuous monitoring of biometric parameters paired with appropriate interventions. However, adherence to monitoring tends to decay over time. Our randomized controlled trial sought to determine: (1) if a mobile app with gamification and financial incentives significantly increases adherence to mHealth monitoring in a population of heart failure patients; and (2) if activity data correlate with disease-specific symptoms.</p><p><strong>Materials and methods: </strong>We recruited individuals with heart failure into a prospective 180-day monitoring study with 3 arms. All 3 arms included monitoring with a connected weight scale and an activity tracker. The second arm included an additional mobile app with gamification, and the third arm included the mobile app and a financial incentive awarded based on adherence to mobile monitoring.</p><p><strong>Results: </strong>We recruited 111 heart failure patients into the study. We found that the arm including the financial incentive led to significantly higher adherence to activity tracker (95% vs 72.2%, P = .01) and weight (87.5% vs 69.4%, P = .002) monitoring compared to the arm that included the monitoring devices alone. Furthermore, we found a significant correlation between daily steps and daily symptom severity.</p><p><strong>Discussion and conclusion: </strong>Our findings indicate that mobile apps with added engagement features can be useful tools for improving adherence over time and may thus increase the impact of mHealth-driven interventions. Additionally, activity tracker data can provide passive monitoring of disease burden that may be used to predict future events.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142037585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anna Northrop, Anika Christofferson, Saumya Umashankar, Michelle Melisko, Paolo Castillo, Thelma Brown, Diane Heditsian, Susie Brain, Carol Simmons, Tina Hieken, Kathryn J Ruddy, Candace Mainor, Anosheh Afghahi, Sarah Tevis, Anne Blaes, Irene Kang, Adam Asare, Laura Esserman, Dawn L Hershman, Amrita Basu
Objectives: We describe the development and implementation of a system for monitoring patient-reported adverse events and quality of life using electronic Patient Reported Outcome (ePRO) instruments in the I-SPY2 Trial, a phase II clinical trial for locally advanced breast cancer. We describe the administration of technological, workflow, and behavior change interventions and their associated impact on questionnaire completion.
Materials and methods: Using the OpenClinica electronic data capture system, we developed rules-based logic to build automated ePRO surveys customized to the I-SPY2 treatment schedule. We piloted ePROs at the University of California, San Francisco (UCSF) to optimize workflow in the context of trial treatment scenarios and staggered the rollout of the ePRO system to 26 sites to ensure effective implementation of the technology.
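A simplified sketch of rules-based scheduling of this kind; the trial's actual OpenClinica rules and timepoint offsets are not reproduced, and the timepoints and offsets below are illustrative assumptions.

```python
from datetime import date, timedelta

# Illustrative only: release each ePRO questionnaire at a fixed offset from the
# treatment start date. Timepoint names and offsets are hypothetical.
TIMEPOINT_OFFSETS_DAYS = {
    "screening": 0,
    "early_treatment": 21,
    "mid_treatment": 84,
    "pre_surgery": 168,
}

def schedule_epro_surveys(treatment_start: date) -> dict[str, date]:
    """Return the calendar date on which each ePRO questionnaire is released."""
    return {
        timepoint: treatment_start + timedelta(days=offset)
        for timepoint, offset in TIMEPOINT_OFFSETS_DAYS.items()
    }

# Example usage:
# schedule_epro_surveys(date(2024, 1, 8))
```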
Results: Increasing ePRO completion requires workflow solutions and research staff engagement. Over two years, we increased baseline survey completion from 25% to 80%. The majority of patients completed between 30% and 75% of the questionnaires they received, with no statistically significant variation in survey completion by age, race, or ethnicity. Patients who completed the screening timepoint questionnaire were significantly more likely to complete more of the surveys they received at later timepoints (mean completion of 74.1% vs 35.5%, P < .0001). Baseline PROMIS social functioning and grade 2 or higher PRO-CTCAE interference from Abdominal Pain, Decreased Appetite, Dizziness, and Shortness of Breath were associated with lower survey completion rates.
Discussion and conclusion: By implementing ePROs, we have the potential to increase the efficiency and accuracy of patient-reported clinical trial data collection while improving quality of care, patient safety, and health outcomes. Our method is accessible across demographics and facilitates data collection and sharing across nationwide sites. We identify predictors of decreased completion, which can be used to optimize resource allocation by better targeting efforts such as in-person outreach, staff engagement, a robust technical workflow, and increased monitoring to improve overall completion rates.
{"title":"Implementation and impact of an electronic patient reported outcomes system in a phase II multi-site adaptive platform clinical trial for early-stage breast cancer.","authors":"Anna Northrop, Anika Christofferson, Saumya Umashankar, Michelle Melisko, Paolo Castillo, Thelma Brown, Diane Heditsian, Susie Brain, Carol Simmons, Tina Hieken, Kathryn J Ruddy, Candace Mainor, Anosheh Afghahi, Sarah Tevis, Anne Blaes, Irene Kang, Adam Asare, Laura Esserman, Dawn L Hershman, Amrita Basu","doi":"10.1093/jamia/ocae190","DOIUrl":"https://doi.org/10.1093/jamia/ocae190","url":null,"abstract":"<p><strong>Objectives: </strong>We describe the development and implementation of a system for monitoring patient-reported adverse events and quality of life using electronic Patient Reported Outcome (ePRO) instruments in the I-SPY2 Trial, a phase II clinical trial for locally advanced breast cancer. We describe the administration of technological, workflow, and behavior change interventions and their associated impact on questionnaire completion.</p><p><strong>Materials and methods: </strong>Using the OpenClinica electronic data capture system, we developed rules-based logic to build automated ePRO surveys, customized to the I-SPY2 treatment schedule. We piloted ePROs at the University of California, San Francisco (UCSF) to optimize workflow in the context of trial treatment scenarios and staggered rollout of the ePRO system to 26 sites to ensure effective implementation of the technology.</p><p><strong>Results: </strong>Increasing ePRO completion requires workflow solutions and research staff engagement. Over two years, we increased baseline survey completion from 25% to 80%. The majority of patients completed between 30% and 75% of the questionnaires they received, with no statistically significant variation in survey completion by age, race or ethnicity. Patients who completed the screening timepoint questionnaire were significantly more likely to complete more of the surveys they received at later timepoints (mean completion of 74.1% vs 35.5%, P < .0001). Baseline PROMIS social functioning and grade 2 or more PRO-CTCAE interference of Abdominal Pain, Decreased Appetite, Dizziness and Shortness of Breath was associated with lower survey completion rates.</p><p><strong>Discussion and conclusion: </strong>By implementing ePROs, we have the potential to increase efficiency and accuracy of patient-reported clinical trial data collection, while improving quality of care, patient safety, and health outcomes. Our method is accessible across demographics and facilitates an ease of data collection and sharing across nationwide sites. 
We identify predictors of decreased completion that can optimize resource allocation by better targeting efforts such as in-person outreach, staff engagement, a robust technical workflow, and increased monitoring to improve overall completion rates.</p><p><strong>Trial registration: </strong>https://clinicaltrials.gov/study/NCT01042379.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142001162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrew Guide, Shawn Garbett, Xiaoke Feng, Brandy M Mapes, Justin Cook, Lina Sulieman, Robert M Cronin, Qingxia Chen
Importance: Scales often arise from multi-item questionnaires, yet commonly face item non-response. Traditional solutions use weighted mean (WMean) from available responses, but potentially overlook missing data intricacies. Advanced methods like multiple imputation (MI) address broader missing data, but demand increased computational resources. Researchers frequently use survey data in the All of Us Research Program (All of Us), and it is imperative to determine if the increased computational burden of employing MI to handle non-response is justifiable.
Objectives: Using the 5-item Physical Activity Neighborhood Environment Scale (PANES) in All of Us, this study assessed the tradeoff between efficacy and computational demands of WMean, MI, and inverse probability weighting (IPW) when dealing with item non-response.
Materials and methods: Synthetic missingness, allowing 1 or more item non-response, was introduced into PANES across 3 missing mechanisms and various missing percentages (10%-50%). Each scenario compared WMean of complete questions, MI, and IPW on bias, variability, coverage probability, and computation time.
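A toy sketch of one cell of this simulation grid, limited to MCAR missingness and the weighted-mean score; MI, IPW, and the other missingness mechanisms are omitted, and nothing here is the authors' code.

```python
import numpy as np

# Toy sketch: inject MCAR item non-response into a 5-item PANES-like scale and
# score respondents by the mean of their available items (the simplest WMean).

rng = np.random.default_rng(0)
n_respondents, n_items, missing_rate = 1_000, 5, 0.30

items = rng.integers(1, 5, size=(n_respondents, n_items)).astype(float)  # responses coded 1-4
true_score = items.mean(axis=1)

mask = rng.random(items.shape) < missing_rate   # MCAR: each item equally likely to be missing
mask[mask.all(axis=1), 0] = False               # keep >=1 observed item so every scale is scorable
observed = np.where(mask, np.nan, items)

wmean_score = np.nanmean(observed, axis=1)      # score from available responses
print(f"Mean bias of WMean under 30% MCAR: {np.mean(wmean_score - true_score):+.4f}")
```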
Results: All methods showed minimal bias (all <5.5%) when internal consistency was good, with WMean suffering the most when consistency was poor. IPW showed considerable variability as the missing percentage increased. MI required significantly more computational resources, taking >8000 and >100 times longer than WMean and IPW, respectively, in the full data analysis.
Discussion and conclusion: The marginal performance advantages of MI for item non-response in highly reliable scales do not warrant its escalated cloud computational burden in All of Us, particularly when coupled with computationally demanding post-imputation analyses. Researchers using survey scales with low missingness could utilize WMean to reduce computing burden.
{"title":"Balancing efficacy and computational burden: weighted mean, multiple imputation, and inverse probability weighting methods for item non-response in reliable scales.","authors":"Andrew Guide, Shawn Garbett, Xiaoke Feng, Brandy M Mapes, Justin Cook, Lina Sulieman, Robert M Cronin, Qingxia Chen","doi":"10.1093/jamia/ocae217","DOIUrl":"https://doi.org/10.1093/jamia/ocae217","url":null,"abstract":"<p><strong>Importance: </strong>Scales often arise from multi-item questionnaires, yet commonly face item non-response. Traditional solutions use weighted mean (WMean) from available responses, but potentially overlook missing data intricacies. Advanced methods like multiple imputation (MI) address broader missing data, but demand increased computational resources. Researchers frequently use survey data in the All of Us Research Program (All of Us), and it is imperative to determine if the increased computational burden of employing MI to handle non-response is justifiable.</p><p><strong>Objectives: </strong>Using the 5-item Physical Activity Neighborhood Environment Scale (PANES) in All of Us, this study assessed the tradeoff between efficacy and computational demands of WMean, MI, and inverse probability weighting (IPW) when dealing with item non-response.</p><p><strong>Materials and methods: </strong>Synthetic missingness, allowing 1 or more item non-response, was introduced into PANES across 3 missing mechanisms and various missing percentages (10%-50%). Each scenario compared WMean of complete questions, MI, and IPW on bias, variability, coverage probability, and computation time.</p><p><strong>Results: </strong>All methods showed minimal biases (all <5.5%) for good internal consistency, with WMean suffered most with poor consistency. IPW showed considerable variability with increasing missing percentage. MI required significantly more computational resources, taking >8000 and >100 times longer than WMean and IPW in full data analysis, respectively.</p><p><strong>Discussion and conclusion: </strong>The marginal performance advantages of MI for item non-response in highly reliable scales do not warrant its escalated cloud computational burden in All of Us, particularly when coupled with computationally demanding post-imputation analyses. Researchers using survey scales with low missingness could utilize WMean to reduce computing burden.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141977130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Izabelle Humes, Cathy Shyr, Moira Dillon, Zhongjie Liu, Jennifer Peterson, Chris St Jeor, Jacqueline Malkes, Hiral Master, Brandy Mapes, Romuladus Azuine, Nakia Mack, Bassent Abdelbary, Joyonna Gamble-George, Emily Goldmann, Stephanie Cook, Fatemeh Choupani, Rubin Baskir, Sydney McMaster, Chris Lunt, Karriem Watson, Minnkyong Lee, Sophie Schwartz, Ruchi Munshi, David Glazer, Eric Banks, Anthony Philippakis, Melissa Basford, Dan Roden, Paul A Harris
Objectives: The All of Us Research Program is a precision medicine initiative aimed at establishing a vast, diverse biomedical database accessible through a cloud-based data analysis platform, the Researcher Workbench (RW). Our goal was to empower the research community by co-designing the implementation of SAS in the RW alongside researchers to enable broader use of All of Us data.
Materials and methods: Researchers from various fields and with different SAS experience levels participated in co-designing the SAS implementation through user experience interviews.
Results: Feedback and lessons learned from user testing informed the final design of the SAS application.
Discussion: The co-design approach is critical for reducing technical barriers, broadening All of Us data use, and enhancing the user experience for data analysis on the RW.
Conclusion: Our co-design approach successfully tailored the implementation of the SAS application to researchers' needs. This approach may inform future software implementations on the RW.
{"title":"Empowering the biomedical research community: Innovative SAS deployment on the All of Us Researcher Workbench.","authors":"Izabelle Humes, Cathy Shyr, Moira Dillon, Zhongjie Liu, Jennifer Peterson, Chris St Jeor, Jacqueline Malkes, Hiral Master, Brandy Mapes, Romuladus Azuine, Nakia Mack, Bassent Abdelbary, Joyonna Gamble-George, Emily Goldmann, Stephanie Cook, Fatemeh Choupani, Rubin Baskir, Sydney McMaster, Chris Lunt, Karriem Watson, Minnkyong Lee, Sophie Schwartz, Ruchi Munshi, David Glazer, Eric Banks, Anthony Philippakis, Melissa Basford, Dan Roden, Paul A Harris","doi":"10.1093/jamia/ocae216","DOIUrl":"10.1093/jamia/ocae216","url":null,"abstract":"<p><strong>Objectives: </strong>The All of Us Research Program is a precision medicine initiative aimed at establishing a vast, diverse biomedical database accessible through a cloud-based data analysis platform, the Researcher Workbench (RW). Our goal was to empower the research community by co-designing the implementation of SAS in the RW alongside researchers to enable broader use of All of Us data.</p><p><strong>Materials and methods: </strong>Researchers from various fields and with different SAS experience levels participated in co-designing the SAS implementation through user experience interviews.</p><p><strong>Results: </strong>Feedback and lessons learned from user testing informed the final design of the SAS application.</p><p><strong>Discussion: </strong>The co-design approach is critical for reducing technical barriers, broadening All of Us data use, and enhancing the user experience for data analysis on the RW.</p><p><strong>Conclusion: </strong>Our co-design approach successfully tailored the implementation of the SAS application to researchers' needs. This approach may inform future software implementations on the RW.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141972205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carla McGruder, Kelly Tangney, Deanna Erwin, Jake Plewa, Karyn Onyeneho, Rhonda Moore, Anastasia Wise, Scott Topper, Alicia Y Zhou
Objective: This article outlines a scalable system developed by the All of Us Research Program's Genetic Counseling Resource to vet a large database of healthcare resources for supporting participants with health-related DNA results.
Materials and methods: After a literature review of established evaluation frameworks for health resources, we created SONAR, a 10-item framework and grading scale for health-related participant-facing resources. SONAR was used to review clinical resources that could be shared with participants during genetic counseling.
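The abstract does not enumerate the 10 SONAR items or the grading cutoff, so the sketch below is a purely hypothetical stand-in showing the general shape of such a checklist-and-threshold review.

```python
from dataclasses import dataclass, fields

# Purely hypothetical stand-in for a 10-item vetting checklist: the actual SONAR
# items, weights, and approval threshold are not described in the abstract, so
# the criteria and cutoff below are invented for illustration.

@dataclass
class ResourceReview:
    credible_sponsor: bool
    current_content: bool
    evidence_based: bool
    free_or_low_cost: bool
    accessible_language: bool
    privacy_protective: bool
    relevant_to_participants: bool
    actionable_next_steps: bool
    contact_information: bool
    no_conflicts_of_interest: bool

    def score(self) -> int:
        """Number of criteria met (0-10)."""
        return sum(getattr(self, f.name) for f in fields(self))

    def approved(self, threshold: int = 8) -> bool:
        """Approve when at least `threshold` criteria are met (illustrative cutoff)."""
        return self.score() >= threshold
```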
Results: Application of SONAR shortened resource approval time from 7 days to 1 day. In total, 256 resources were approved and 8 were rejected through SONAR review. Most approved resources were relevant to participants nationwide (60.0%). The most common resource types were related to support groups (20%), cancer care (30.6%), and general educational resources (12.4%). All of Us genetic counselors provided 1161 approved resources during 3005 (38.6%) consults; the most frequently shared were local genetic counselors (29.9%), support groups (21.9%), and educational resources (21.0%).
Discussion: SONAR's systematic method simplifies resource vetting for healthcare providers, easing the burden of identifying and evaluating credible resources. Compiling these resources into a user-friendly database allows providers to share them efficiently, better equipping participants to complete follow-up actions from health-related DNA results.
Conclusion: The All of Us Genetic Counseling Resource connects participants receiving health-related DNA results with relevant follow-up resources on a high-volume, national level. This has been made possible by the creation of a novel resource database and validation system.
{"title":"Sounding out solutions: using SONAR to connect participants with relevant healthcare resources.","authors":"Carla McGruder, Kelly Tangney, Deanna Erwin, Jake Plewa, Karyn Onyeneho, Rhonda Moore, Anastasia Wise, Scott Topper, Alicia Y Zhou","doi":"10.1093/jamia/ocae200","DOIUrl":"https://doi.org/10.1093/jamia/ocae200","url":null,"abstract":"<p><strong>Objective: </strong>This article outlines a scalable system developed by the All of Us Research Program's Genetic Counseling Resource to vet a large database of healthcare resources for supporting participants with health-related DNA results.</p><p><strong>Materials and methods: </strong>After a literature review of established evaluation frameworks for health resources, we created SONAR, a 10-item framework and grading scale for health-related participant-facing resources. SONAR was used to review clinical resources that could be shared with participants during genetic counseling.</p><p><strong>Results: </strong>Application of SONAR shortened resource approval time from 7 days to 1 day. About 256 resources were approved and 8 rejected through SONAR review. Most approved resources were relevant to participants nationwide (60.0%). The most common resource types were related to support groups (20%), cancer care (30.6%), and general educational resources (12.4%). All of Us genetic counselors provided 1161 approved resources during 3005 (38.6%) consults, mainly to local genetic counselors (29.9%), support groups (21.9%), and educational resources (21.0%).</p><p><strong>Discussion: </strong>SONAR's systematic method simplifies resource vetting for healthcare providers, easing the burden of identifying and evaluating credible resources. Compiling these resources into a user-friendly database allows providers to share these resources efficiently, better equipping participants to complete follow up actions from health-related DNA results.</p><p><strong>Conclusion: </strong>The All of Us Genetic Counseling Resource connects participants receiving health-related DNA results with relevant follow-up resources on a high-volume, national level. This has been made possible by the creation of a novel resource database and validation system.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141879672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}