Evaluating gradient-based explanation methods for neural network ECG analysis using heatmaps.
Andrea Marheim Storås, Steffen Mæland, Jonas L Isaksen, Steven Alexander Hicks, Vajira Thambawita, Claus Graff, Hugo Lewi Hammer, Pål Halvorsen, Michael Alexander Riegler, Jørgen K Kanters
Objective: Evaluate popular explanation methods using heatmap visualizations to explain the predictions of deep neural networks for electrocardiogram (ECG) analysis and provide recommendations for the selection of explanation methods.
Materials and methods: A residual deep neural network was trained on ECGs to predict intervals and amplitudes. Nine commonly used explanation methods (Saliency, Deconvolution, Guided backpropagation, Gradient SHAP, SmoothGrad, Input × gradient, DeepLIFT, Integrated gradients, GradCAM) were qualitatively evaluated by medical experts and objectively evaluated using a perturbation-based method.
Results: No single explanation method consistently outperformed the other methods, but some methods were clearly inferior. We found considerable disagreement between the human expert evaluation and the objective evaluation by perturbation.
Discussion: The best explanation method depended on the ECG measure. To ensure that future explanations of deep neural networks for medical data analyses are useful to medical experts, data scientists developing new explanation methods should collaborate closely with domain experts. Because no single explanation method performs best in all use cases, several methods should be applied.
Conclusion: Several explanation methods should be used to determine the most suitable approach.
{"title":"Evaluating gradient-based explanation methods for neural network ECG analysis using heatmaps.","authors":"Andrea Marheim Storås, Steffen Mæland, Jonas L Isaksen, Steven Alexander Hicks, Vajira Thambawita, Claus Graff, Hugo Lewi Hammer, Pål Halvorsen, Michael Alexander Riegler, Jørgen K Kanters","doi":"10.1093/jamia/ocae280","DOIUrl":"https://doi.org/10.1093/jamia/ocae280","url":null,"abstract":"<p><strong>Objective: </strong>Evaluate popular explanation methods using heatmap visualizations to explain the predictions of deep neural networks for electrocardiogram (ECG) analysis and provide recommendations for selection of explanations methods.</p><p><strong>Materials and methods: </strong>A residual deep neural network was trained on ECGs to predict intervals and amplitudes. Nine commonly used explanation methods (Saliency, Deconvolution, Guided backpropagation, Gradient SHAP, SmoothGrad, Input × gradient, DeepLIFT, Integrated gradients, GradCAM) were qualitatively evaluated by medical experts and objectively evaluated using a perturbation-based method.</p><p><strong>Results: </strong>No single explanation method consistently outperformed the other methods, but some methods were clearly inferior. We found considerable disagreement between the human expert evaluation and the objective evaluation by perturbation.</p><p><strong>Discussion: </strong>The best explanation method depended on the ECG measure. To ensure that future explanations of deep neural networks for medical data analyses are useful to medical experts, data scientists developing new explanation methods should collaborate tightly with domain experts. Because there is no explanation method that performs best in all use cases, several methods should be applied.</p><p><strong>Conclusion: </strong>Several explanation methods should be used to determine the most suitable approach.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142591550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Trending in the right direction: critical access hospitals increased adoption of advanced electronic health record functions from 2018 to 2023.
Nate C Apathy, A Jay Holmgren, Paige Nong, Julia Adler-Milstein, Jordan Everson
Objectives: We analyzed trends in adoption of advanced patient engagement and clinical data analytics functionalities among critical access hospitals (CAHs) and non-CAHs to assess how historical gaps have changed.
Materials and methods: We used 2014, 2018, and 2023 data from the American Hospital Association Annual Survey IT Supplement to measure differences in adoption rates (ie, the "adoption gap") of patient engagement and clinical data analytics functionalities across CAHs and non-CAHs. We measured changes over time in CAH and non-CAH adoption of 6 "core" clinical data analytics functionalities, 5 "core" patient engagement functionalities, 5 new patient engagement functionalities, and 3 bulk data export use cases. We constructed 2 composite measures for core functionalities and analyzed adoption for other functionalities individually.
Results: Core functionality adoption increased from 21% of CAHs in 2014 to 56% in 2023 for clinical data analytics and 18% to 49% for patient engagement. The CAH adoption gap in both domains narrowed from 2018 to 2023 (both P < .01). More than 90% of all hospitals had adopted viewing and downloading electronic data and clinical notes by 2023. The largest CAH adoption gaps in 2023 were for Fast Healthcare Interoperability Resources (FHIR) bulk export use cases (eg, analytics and reporting: 63% of CAHs, 81% of non-CAHs, P < .001).
Discussion: Adoption of advanced electronic health record functionalities has increased for CAHs and non-CAHs, and some adoption gaps have been closed since 2018. However, CAHs may continue to struggle with clinical data analytics and FHIR-based functionalities.
Conclusion: Some crucial patient engagement functionalities have reached near-universal adoption; however, policymakers should consider programs to support CAHs in closing remaining adoption gaps.
{"title":"Trending in the right direction: critical access hospitals increased adoption of advanced electronic health record functions from 2018 to 2023.","authors":"Nate C Apathy, A Jay Holmgren, Paige Nong, Julia Adler-Milstein, Jordan Everson","doi":"10.1093/jamia/ocae267","DOIUrl":"https://doi.org/10.1093/jamia/ocae267","url":null,"abstract":"<p><strong>Objectives: </strong>We analyzed trends in adoption of advanced patient engagement and clinical data analytics functionalities among critical access hospitals (CAHs) and non-CAHs to assess how historical gaps have changed.</p><p><strong>Materials and methods: </strong>We used 2014, 2018, and 2023 data from the American Hospital Association Annual Survey IT Supplement to measure differences in adoption rates (ie, the \"adoption gap\") of patient engagement and clinical data analytics functionalities across CAHs and non-CAHs. We measured changes over time in CAH and non-CAH adoption of 6 \"core\" clinical data analytics functionalities, 5 \"core\" patient engagement functionalities, 5 new patient engagement functionalities, and 3 bulk data export use cases. We constructed 2 composite measures for core functionalities and analyzed adoption for other functionalities individually.</p><p><strong>Results: </strong>Core functionality adoption increased from 21% of CAHs in 2014 to 56% in 2023 for clinical data analytics and 18% to 49% for patient engagement. The CAH adoption gap in both domains narrowed from 2018 to 2023 (both P < .01). More than 90% of all hospitals had adopted viewing and downloading electronic data and clinical notes by 2023. The largest CAH adoption gaps in 2023 were for Fast Healthcare Interoperability Resources (FHIR) bulk export use cases (eg, analytics and reporting: 63% of CAHs, 81% of non-CAHs, P < .001).</p><p><strong>Discussion: </strong>Adoption of advanced electronic health record functionalities has increased for CAHs and non-CAHs, and some adoption gaps have been closed since 2018. However, CAHs may continue to struggle with clinical data analytics and FHIR-based functionalities.</p><p><strong>Conclusion: </strong>Some crucial patient engagement functionalities have reached near-universal adoption; however, policymakers should consider programs to support CAHs in closing remaining adoption gaps.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142591482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The role of routine and structured social needs data collection in improving care in US hospitals.
Chelsea Richwine, Vaishali Patel, Jordan Everson, Bradley Iott
Objectives: To understand how health-related social needs (HRSN) data are collected at US hospitals and the implications for their use.
Materials and methods: Using 2023 nationally representative survey data on US hospitals (N = 2775), we described hospitals' routine and structured collection and use of HRSN data and examined the relationship between methods of data collection and specific uses. Multivariate logistic regression was used to identify characteristics associated with data collection and use and understand how methods of data collection relate to use.
Results: In 2023, 88% of hospitals collected HRSN data (64% routinely, 72% structured). While hospitals commonly used data for internal purposes (eg, discharge planning, 79%), those that collected data routinely and in a structured format (58%) used data for purposes involving coordination or exchange with other organizations (eg, making referrals, 74%) at higher rates than hospitals that collected data but not routinely or in a non-structured format (eg, 93% vs 67% for referrals, P < .05). In multivariate regression, routine and structured data collection was positively associated with all uses of data examined. Hospital location, ownership, system-affiliation, value-based care participation, and critical access designation were associated with HRSN data collection, but only system-affiliation was consistently (positively) associated with use.
Discussion: While most hospitals screen for social needs, fewer collect data routinely and in a structured format that would facilitate downstream use. Routine and structured data collection was associated with greater use, particularly for secondary purposes.
Conclusion: Routine and structured screening may produce more actionable data that can be used for a range of purposes that support patient care and improve community and population health, underscoring the importance of continued efforts to increase routine screening and standardize HRSN data collection.
{"title":"The role of routine and structured social needs data collection in improving care in US hospitals.","authors":"Chelsea Richwine, Vaishali Patel, Jordan Everson, Bradley Iott","doi":"10.1093/jamia/ocae279","DOIUrl":"https://doi.org/10.1093/jamia/ocae279","url":null,"abstract":"<p><strong>Objectives: </strong>To understand how health-related social needs (HRSN) data are collected at US hospitals and implications for use.</p><p><strong>Materials and methods: </strong>Using 2023 nationally representative survey data on US hospitals (N = 2775), we described hospitals' routine and structured collection and use of HRSN data and examined the relationship between methods of data collection and specific uses. Multivariate logistic regression was used to identify characteristics associated with data collection and use and understand how methods of data collection relate to use.</p><p><strong>Results: </strong>In 2023, 88% of hospitals collected HRSN data (64% routinely, 72% structured). While hospitals commonly used data for internal purposes (eg, discharge planning, 79%), those that collected data routinely and in a structured format (58%) used data for purposes involving coordination or exchange with other organizations (eg, making referrals, 74%) at higher rates than hospitals that collected data but not routinely or in a non-structured format (eg, 93% vs 67% for referrals, P< .05). In multivariate regression, routine and structured data collection was positively associated with all uses of data examined. Hospital location, ownership, system-affiliation, value-based care participation, and critical access designation were associated with HRSN data collection, but only system-affiliation was consistently (positively) associated with use.</p><p><strong>Discussion: </strong>While most hospitals screen for social needs, fewer collect data routinely and in a structured format that would facilitate downstream use. Routine and structured data collection was associated with greater use, particularly for secondary purposes.</p><p><strong>Conclusion: </strong>Routine and structured screening may result in more actionable data that facilitates use for various purposes that support patient care and improve community and population health, indicating the importance of continuing efforts to increase routine screening and standardize HRSN data collection.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142591563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is ChatGPT worthy enough for provisioning clinical decision support?","authors":"Partha Pratim Ray","doi":"10.1093/jamia/ocae282","DOIUrl":"https://doi.org/10.1093/jamia/ocae282","url":null,"abstract":"","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142583648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to: Artificial intelligence for optimizing recruitment and retention in clinical trials: a scoping review.","authors":"","doi":"10.1093/jamia/ocae283","DOIUrl":"https://doi.org/10.1093/jamia/ocae283","url":null,"abstract":"","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142583537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A GIS software-based method to identify public health data belonging to address-defined communities.
Amanda M Lam, Mariana C Singletary, Theresa Cullen
Objective: This communication presents the results of defining a tribal health jurisdiction by a combination of tribal affiliation (TA) and case address.
Materials and methods: Through a county-tribal partnership, Geographic Information System (GIS) software and custom code were used to extract tribal data from county data by identifying reservation addresses in county extracts of COVID-19 case records from December 30, 2019, to December 31, 2022 (n = 374 653) and COVID-19 vaccination records from December 1, 2020, to April 18, 2023 (n = 2 355 058).
Results: The tool identified 1.91 times as many case records and 3.76 times as many vaccination records as filtering by TA alone.
Discussion and conclusion: This method of identifying communities by patient address, in combination with TA and enrollment, can help tribal health jurisdictions attain equitable access to public health data when carried out through a partnership governed by a data sharing agreement. This methodology has potential applications for other populations underrepresented in public health and clinical research.
{"title":"A GIS software-based method to identify public health data belonging to address-defined communities.","authors":"Amanda M Lam, Mariana C Singletary, Theresa Cullen","doi":"10.1093/jamia/ocae235","DOIUrl":"10.1093/jamia/ocae235","url":null,"abstract":"<p><strong>Objective: </strong>This communication presents the results of defining a tribal health jurisdiction by a combination of tribal affiliation (TA) and case address.</p><p><strong>Materials and methods: </strong>Through a county-tribal partnership, Geographic Information System (GIS) software and custom code were used to extract tribal data from county data by identifying reservation addresses in county extracts of COVID-19 case records from December 30, 2019, to December 31, 2022 (n = 374 653) and COVID-19 vaccination records from December 1, 2020, to April 18, 2023 (n = 2 355 058).</p><p><strong>Results: </strong>The tool identified 1.91 times as many case records and 3.76 times as many vaccination records as filtering by TA alone.</p><p><strong>Discussion and conclusion: </strong>This method of identifying communities by patient address, in combination with TA and enrollment, can help tribal health jurisdictions attain equitable access to public health data, when done in partnership with a data sharing agreement. This methodology has potential applications for other populations underrepresented in public health and clinical research.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"2716-2721"},"PeriodicalIF":4.7,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11491637/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142057033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A machine learning framework to adjust for learning effects in medical device safety evaluation.
Jejo D Koola, Karthik Ramesh, Jialin Mao, Minyoung Ahn, Sharon E Davis, Usha Govindarajulu, Amy M Perkins, Dax Westerman, Henry Ssemaganda, Theodore Speroff, Lucila Ohno-Machado, Craig R Ramsay, Art Sedrakyan, Frederic S Resnic, Michael E Matheny
Objectives: Traditional methods for medical device post-market surveillance often fail to accurately account for operator learning effects, leading to biased assessments of device safety. These methods struggle with non-linearity, complex learning curves, and time-varying covariates, such as physician experience. To address these limitations, we sought to develop a machine learning (ML) framework to detect and adjust for operator learning effects.
Materials and methods: A gradient-boosted decision tree ML method was used to analyze synthetic datasets that replicate the complexity of clinical scenarios involving high-risk medical devices. We designed this process to detect learning effects using a risk-adjusted cumulative sum method, quantify the excess adverse event rate attributable to operator inexperience, and adjust for these alongside patient factors in evaluating device safety signals. To maintain integrity, we employed blinding between data generation and analysis teams. Synthetic data used underlying distributions and patient feature correlations based on clinical data from the Department of Veterans Affairs between 2005 and 2012. We generated 2494 synthetic datasets with widely varying characteristics including number of patient features, operators and institutions, and the operator learning form. Each dataset contained a hypothetical study device, Device B, and a reference device, Device A. We evaluated accuracy in identifying learning effects and identifying and estimating the strength of the device safety signal. Our approach also evaluated different clinically relevant thresholds for safety signal detection.
Results: Our framework accurately identified the presence or absence of learning effects in 93.6% of datasets and correctly determined device safety signals in 93.4% of cases. The estimated device odds ratios' 95% confidence intervals were accurately aligned with the specified ratios in 94.7% of datasets. In contrast, a comparative model excluding operator learning effects significantly underperformed in detecting device signals and in accuracy. Notably, our framework achieved 100% specificity for clinically relevant safety signal thresholds, although sensitivity varied with the threshold applied.
Discussion: A machine learning framework, tailored for the complexities of post-market device evaluation, may provide superior performance compared to standard parametric techniques when operator learning is present.
Conclusion: Demonstrating the capacity of ML to overcome complex evaluative challenges, our framework addresses the limitations of traditional statistical methods in current post-market surveillance processes. By offering a reliable means to detect and adjust for learning effects, it may significantly improve medical device safety evaluation.
{"title":"A machine learning framework to adjust for learning effects in medical device safety evaluation.","authors":"Jejo D Koola, Karthik Ramesh, Jialin Mao, Minyoung Ahn, Sharon E Davis, Usha Govindarajulu, Amy M Perkins, Dax Westerman, Henry Ssemaganda, Theodore Speroff, Lucila Ohno-Machado, Craig R Ramsay, Art Sedrakyan, Frederic S Resnic, Michael E Matheny","doi":"10.1093/jamia/ocae273","DOIUrl":"https://doi.org/10.1093/jamia/ocae273","url":null,"abstract":"<p><strong>Objectives: </strong>Traditional methods for medical device post-market surveillance often fail to accurately account for operator learning effects, leading to biased assessments of device safety. These methods struggle with non-linearity, complex learning curves, and time-varying covariates, such as physician experience. To address these limitations, we sought to develop a machine learning (ML) framework to detect and adjust for operator learning effects.</p><p><strong>Materials and methods: </strong>A gradient-boosted decision tree ML method was used to analyze synthetic datasets that replicate the complexity of clinical scenarios involving high-risk medical devices. We designed this process to detect learning effects using a risk-adjusted cumulative sum method, quantify the excess adverse event rate attributable to operator inexperience, and adjust for these alongside patient factors in evaluating device safety signals. To maintain integrity, we employed blinding between data generation and analysis teams. Synthetic data used underlying distributions and patient feature correlations based on clinical data from the Department of Veterans Affairs between 2005 and 2012. We generated 2494 synthetic datasets with widely varying characteristics including number of patient features, operators and institutions, and the operator learning form. Each dataset contained a hypothetical study device, Device B, and a reference device, Device A. We evaluated accuracy in identifying learning effects and identifying and estimating the strength of the device safety signal. Our approach also evaluated different clinically relevant thresholds for safety signal detection.</p><p><strong>Results: </strong>Our framework accurately identified the presence or absence of learning effects in 93.6% of datasets and correctly determined device safety signals in 93.4% of cases. The estimated device odds ratios' 95% confidence intervals were accurately aligned with the specified ratios in 94.7% of datasets. In contrast, a comparative model excluding operator learning effects significantly underperformed in detecting device signals and in accuracy. Notably, our framework achieved 100% specificity for clinically relevant safety signal thresholds, although sensitivity varied with the threshold applied.</p><p><strong>Discussion: </strong>A machine learning framework, tailored for the complexities of post-market device evaluation, may provide superior performance compared to standard parametric techniques when operator learning is present.</p><p><strong>Conclusion: </strong>Demonstrating the capacity of ML to overcome complex evaluative challenges, our framework addresses the limitations of traditional statistical methods in current post-market surveillance processes. 
By offering a reliable means to detect and adjust for learning effects, it may significantly improve medical device safety evaluation.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142548633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
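The learning-effect detection step relies on a risk-adjusted cumulative sum (CUSUM). The sketch below implements the standard Steiner-style log-likelihood-ratio form as an illustration; the odds-ratio shift r_a and control limit h are arbitrary tuning choices, and this is not the authors' exact implementation:

```python
# Hedged sketch of a risk-adjusted CUSUM (Steiner-style log-likelihood weights).
# The odds-ratio shift r_a and control limit h are illustrative tuning choices.
import math
from typing import Iterable

def risk_adjusted_cusum(outcomes: Iterable[int],
                        predicted_risks: Iterable[float],
                        r_a: float = 2.0,
                        h: float = 4.0):
    """Return the CUSUM trajectory and the index of the first signal (or None).

    outcomes        : 1 if the adverse event occurred, else 0
    predicted_risks : risk-adjusted event probabilities for each case
    r_a             : odds-ratio increase the chart is tuned to detect
    h               : control limit; crossing it raises a safety signal
    """
    s, trajectory, signal_at = 0.0, [], None
    for t, (y, p) in enumerate(zip(outcomes, predicted_risks)):
        denom = 1.0 - p + r_a * p                     # odds shift under the alternative
        w = math.log(r_a / denom) if y == 1 else math.log(1.0 / denom)
        s = max(0.0, s + w)                           # reset at zero, accumulate evidence
        trajectory.append(s)
        if signal_at is None and s > h:
            signal_at = t
    return trajectory, signal_at

# Toy usage: with a low illustrative limit, the excess events signal on the final case.
traj, alarm = risk_adjusted_cusum(outcomes=[0, 1, 1, 0, 1, 1, 1],
                                  predicted_risks=[0.1, 0.1, 0.2, 0.1, 0.15, 0.1, 0.1],
                                  h=2.5)
print(traj, alarm)  # alarm == 6
```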
Real-world federated learning in radiology: hurdles to overcome and benefits to gain.
Markus Ralf Bujotzek, Ünal Akünal, Stefan Denner, Peter Neher, Maximilian Zenk, Eric Frodl, Astha Jaiswal, Moon Kim, Nicolai R Krekiehn, Manuel Nickel, Richard Ruppel, Marcus Both, Felix Döllinger, Marcel Opitz, Thorsten Persigehl, Jens Kleesiek, Tobias Penzkofer, Klaus Maier-Hein, Andreas Bucher, Rickmer Braren
Objective: Federated Learning (FL) enables collaborative model training while keeping data local. Currently, most FL studies in radiology are conducted in simulated environments because numerous hurdles impede its translation into practice, and the few existing real-world FL initiatives rarely communicate the specific measures taken to overcome these hurdles. To bridge this significant knowledge gap, we propose a comprehensive guide for real-world FL in radiology. Beyond the effort required to implement real-world FL, there is also a lack of comprehensive assessments comparing FL with less complex alternatives in challenging real-world settings, which we address through extensive benchmarking.
Materials and methods: We developed our own FL infrastructure within the German Radiological Cooperative Network (RACOON) and demonstrated its functionality by training FL models on lung pathology segmentation tasks across six university hospitals. Insights gained while establishing our FL initiative and running the extensive benchmark experiments were compiled and categorized into the guide.
Results: The proposed guide outlines essential steps, identified hurdles, and implemented solutions for establishing successful FL initiatives and conducting real-world experiments. Our experimental results demonstrate the practical relevance of the guide and show that FL outperforms less complex alternatives in all evaluation scenarios.
Discussion and conclusion: Our findings justify the effort required to translate FL into real-world applications by demonstrating advantageous performance over alternative approaches. They also emphasize the importance of strategic organization and of robust management of distributed data and infrastructure in real-world settings. With the proposed guide, we aim to help future FL researchers circumvent pitfalls and accelerate the translation of FL into radiological applications.
{"title":"Real-world federated learning in radiology: hurdles to overcome and benefits to gain.","authors":"Markus Ralf Bujotzek, Ünal Akünal, Stefan Denner, Peter Neher, Maximilian Zenk, Eric Frodl, Astha Jaiswal, Moon Kim, Nicolai R Krekiehn, Manuel Nickel, Richard Ruppel, Marcus Both, Felix Döllinger, Marcel Opitz, Thorsten Persigehl, Jens Kleesiek, Tobias Penzkofer, Klaus Maier-Hein, Andreas Bucher, Rickmer Braren","doi":"10.1093/jamia/ocae259","DOIUrl":"https://doi.org/10.1093/jamia/ocae259","url":null,"abstract":"<p><strong>Objective: </strong>Federated Learning (FL) enables collaborative model training while keeping data locally. Currently, most FL studies in radiology are conducted in simulated environments due to numerous hurdles impeding its translation into practice. The few existing real-world FL initiatives rarely communicate specific measures taken to overcome these hurdles. To bridge this significant knowledge gap, we propose a comprehensive guide for real-world FL in radiology. Minding efforts to implement real-world FL, there is a lack of comprehensive assessments comparing FL to less complex alternatives in challenging real-world settings, which we address through extensive benchmarking.</p><p><strong>Materials and methods: </strong>We developed our own FL infrastructure within the German Radiological Cooperative Network (RACOON) and demonstrated its functionality by training FL models on lung pathology segmentation tasks across six university hospitals. Insights gained while establishing our FL initiative and running the extensive benchmark experiments were compiled and categorized into the guide.</p><p><strong>Results: </strong>The proposed guide outlines essential steps, identified hurdles, and implemented solutions for establishing successful FL initiatives conducting real-world experiments. Our experimental results prove the practical relevance of our guide and show that FL outperforms less complex alternatives in all evaluation scenarios.</p><p><strong>Discussion and conclusion: </strong>Our findings justify the efforts required to translate FL into real-world applications by demonstrating advantageous performance over alternative approaches. Additionally, they emphasize the importance of strategic organization, robust management of distributed data and infrastructure in real-world settings. With the proposed guide, we are aiming to aid future FL researchers in circumventing pitfalls and accelerating translation of FL into radiological applications.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142512054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Equitable community-based participatory research engagement with communities of color drives All of Us Wisconsin genomic research priorities.
Suma K Thareja, Xin Yang, Paramita Basak Upama, Aziz Abdullah, Shary Pérez Torres, Linda Jackson Cocroft, Michael Bubolz, Kari McGaughey, Xuelin Lou, Sailaja Kamaraju, Sheikh Iqbal Ahamed, Praveen Madiraju, Anne E Kwitek, Jeffrey Whittle, Zeno Franco
Objective: The NIH All of Us Research Program aims to advance personalized medicine by not only linking patient records, surveys, and genomic data but also engaging with participants, particularly from groups traditionally underrepresented in biomedical research (UBR). This study details how the dialogue between scientists and community members, including many from communities of color, shaped local research priorities.
Materials and methods: We recruited area quantitative, basic, and clinical scientists, as well as community members from our Community and Participant Advisory Boards with a predetermined interest in All of Us research, to serve as members of a Special Interest Group (SIG). An expert community engagement scientist facilitated 6 SIG meetings over the year, explicitly fostering openness and flexibility during conversations. We qualitatively analyzed the discussions using a social movement framework tailored for community-based participatory research (CBPR) mobilization.
Results: The SIG evolved through the CBPR stages of emergence, coalescence, momentum, and maintenance/integration. Researchers prioritized community needs above personal academic interests, while community members kept discussions focused on tangible return of value to communities. One key outcome was a SIG-driven shift in the programmatic and research priorities of the All of Us Research Program in Southeastern Wisconsin. One major challenge was building equitable conversations that balanced scientific rigor and community understanding.
Discussion: Our approach allowed for a rich dialogue to emerge. Points of connection and disconnection between community members and scientists offered important guidance for emerging areas of genomic inquiry.
Conclusion: Our study presents a robust foundation for future efforts to engage diverse communities in CBPR, particularly on healthcare concerns affecting UBR communities.
{"title":"Equitable community-based participatory research engagement with communities of color drives All of Us Wisconsin genomic research priorities.","authors":"Suma K Thareja, Xin Yang, Paramita Basak Upama, Aziz Abdullah, Shary Pérez Torres, Linda Jackson Cocroft, Michael Bubolz, Kari McGaughey, Xuelin Lou, Sailaja Kamaraju, Sheikh Iqbal Ahamed, Praveen Madiraju, Anne E Kwitek, Jeffrey Whittle, Zeno Franco","doi":"10.1093/jamia/ocae265","DOIUrl":"https://doi.org/10.1093/jamia/ocae265","url":null,"abstract":"<p><strong>Objective: </strong>The NIH All of Us Research Program aims to advance personalized medicine by not only linking patient records, surveys, and genomic data but also engaging with participants, particularly from groups traditionally underrepresented in biomedical research (UBR). This study details how the dialogue between scientists and community members, including many from communities of color, shaped local research priorities.</p><p><strong>Materials and methods: </strong>We recruited area quantitative, basic, and clinical scientists as well as community members from our Community and Participant Advisory Boards with a predetermined interest in All of Us research as members of a Special Interest Group (SIG). An expert community engagement scientist facilitated 6 SIG meetings over the year, explicitly fostering openness and flexibility during conversations. We qualitatively analyzed discussions using a social movement framework tailored for community-based participatory research (CBPR) mobilization.</p><p><strong>Results: </strong>The SIG evolved through CBPR stages of emergence, coalescence, momentum, and maintenance/integration. Researchers prioritized community needs above personal academic interests while community members kept discussions focused on tangible return of value to communities. One key outcome includes SIG-driven shifts in programmatic and research priorities of the All of Us Research Program in Southeastern Wisconsin. One major challenge was building equitable conversations that balanced scientific rigor and community understanding.</p><p><strong>Discussion: </strong>Our approach allowed for a rich dialogue to emerge. Points of connection and disconnection between community members and scientists offered important guidance for emerging areas of genomic inquiry.</p><p><strong>Conclusion: </strong>Our study presents a robust foundation for future efforts to engage diverse communities in CBPR, particularly on healthcare concerns affecting UBR communities.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142512053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extracting social support and social isolation information from clinical psychiatry notes: comparing a rule-based natural language processing system and a large language model.
Braja Gopal Patra, Lauren A Lepow, Praneet Kasi Reddy Jagadeesh Kumar, Veer Vekaria, Mohit Manoj Sharma, Prakash Adekkanattu, Brian Fennessy, Gavin Hynes, Isotta Landi, Jorge A Sanchez-Ruiz, Euijung Ryu, Joanna M Biernacka, Girish N Nadkarni, Ardesheer Talati, Myrna Weissman, Mark Olfson, J John Mann, Yiye Zhang, Alexander W Charney, Jyotishman Pathak
Objectives: Social support (SS) and social isolation (SI) are social determinants of health (SDOH) associated with psychiatric outcomes. In electronic health records (EHRs), individual-level SS/SI is typically documented in narrative clinical notes rather than as structured coded data. Natural language processing (NLP) algorithms can automate the otherwise labor-intensive process of extraction of such information.
Materials and methods: Psychiatric encounter notes from Mount Sinai Health System (MSHS, n = 300) and Weill Cornell Medicine (WCM, n = 225) were annotated to create a gold-standard corpus. A rule-based system (RBS) built on lexicons and a large language model (LLM) based on FLAN-T5-XL were developed to identify mentions of SS and SI and their subcategories (eg, social network, instrumental support, and loneliness).
Results: For extracting SS/SI, the RBS obtained higher macroaveraged F1-scores than the LLM at both MSHS (0.89 versus 0.65) and WCM (0.85 versus 0.82). For extracting the subcategories, the RBS also outperformed the LLM at both MSHS (0.90 versus 0.62) and WCM (0.82 versus 0.81).
Discussion and conclusion: Unexpectedly, the RBS outperformed the LLM across all metrics. An intensive review showed that this finding reflects the divergent approaches taken by the two systems: the RBS was designed and refined to follow the same specific rules as the gold-standard annotations, whereas the LLM was more inclusive in its categorization and conformed to common English-language understanding. Both approaches offer advantages, although additional replication studies are warranted.
{"title":"Extracting social support and social isolation information from clinical psychiatry notes: comparing a rule-based natural language processing system and a large language model.","authors":"Braja Gopal Patra, Lauren A Lepow, Praneet Kasi Reddy Jagadeesh Kumar, Veer Vekaria, Mohit Manoj Sharma, Prakash Adekkanattu, Brian Fennessy, Gavin Hynes, Isotta Landi, Jorge A Sanchez-Ruiz, Euijung Ryu, Joanna M Biernacka, Girish N Nadkarni, Ardesheer Talati, Myrna Weissman, Mark Olfson, J John Mann, Yiye Zhang, Alexander W Charney, Jyotishman Pathak","doi":"10.1093/jamia/ocae260","DOIUrl":"https://doi.org/10.1093/jamia/ocae260","url":null,"abstract":"<p><strong>Objectives: </strong>Social support (SS) and social isolation (SI) are social determinants of health (SDOH) associated with psychiatric outcomes. In electronic health records (EHRs), individual-level SS/SI is typically documented in narrative clinical notes rather than as structured coded data. Natural language processing (NLP) algorithms can automate the otherwise labor-intensive process of extraction of such information.</p><p><strong>Materials and methods: </strong>Psychiatric encounter notes from Mount Sinai Health System (MSHS, n = 300) and Weill Cornell Medicine (WCM, n = 225) were annotated to create a gold-standard corpus. A rule-based system (RBS) involving lexicons and a large language model (LLM) using FLAN-T5-XL were developed to identify mentions of SS and SI and their subcategories (eg, social network, instrumental support, and loneliness).</p><p><strong>Results: </strong>For extracting SS/SI, the RBS obtained higher macroaveraged F1-scores than the LLM at both MSHS (0.89 versus 0.65) and WCM (0.85 versus 0.82). For extracting the subcategories, the RBS also outperformed the LLM at both MSHS (0.90 versus 0.62) and WCM (0.82 versus 0.81).</p><p><strong>Discussion and conclusion: </strong>Unexpectedly, the RBS outperformed the LLMs across all metrics. An intensive review demonstrates that this finding is due to the divergent approach taken by the RBS and LLM. The RBS was designed and refined to follow the same specific rules as the gold-standard annotations. Conversely, the LLM was more inclusive with categorization and conformed to common English-language understanding. Both approaches offer advantages, although additional replication studies are warranted.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":""},"PeriodicalIF":4.7,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142479224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}