Accuracy of Smartwatch Pulse Oximetry Measurements in Hospitalized Patients With Coronavirus Disease 2019
Pub Date: 2024-02-26 | DOI: 10.1016/j.mcpdig.2024.02.001
Kevin Rajakariar MBBS; Paul Buntine MBBS; Andrew Ghaly MBBS; Zheng Cheng Zhu MBBS; Vihangi Abeygunawardana MD; Sarah Visakhamoorthy MBBS; Patrick J. Owen PhD; Shaun Tham MD; Liam Hackett MPH; Louise Roberts PhD; Jithin K. Sajeev MBBS, PhD; Nicholas Jones MBBS; Andrew W. Teh MBBS, PhD
Objective
To assess the ability of 2 commercially available smartwatches to accurately detect clinically significant hypoxia in patients hospitalized with coronavirus disease 2019 (COVID-19).
Patients and Methods
A prospective multicenter validation study was performed from November 1, 2021, to August 31, 2022, assessing the built-in pulse oximetry of the Apple Watch Series 7 and Withings ScanWatch against simultaneous ward-based oximetry as the reference standard. Patients hospitalized with active COVID-19 infection who did not require intensive care admission were recruited.
Results
A total of 750 smartwatch pulse oximetry measurements and 400 ward oximetry readings were successfully obtained from 200 patients (54% male; age 66±18 years). For the detection of clinically significant hypoxia, the Apple Watch had a sensitivity and specificity of 34.8% and 97.5%, respectively, with a positive predictive value of 78.1% and a negative predictive value of 85.6%. The Withings ScanWatch had a sensitivity and specificity of 68.5% and 80.8%, respectively, with a positive predictive value of 44.7% and a negative predictive value of 91.9%. Overall accuracy was 84.9% for the Apple Watch and 78.5% for the Withings ScanWatch. Spearman rank correlation coefficients indicated a moderate correlation with ward-based photoplethysmography (Apple: rs=0.61; Withings: rs=0.51; both P<.01).
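For readers less familiar with these diagnostic metrics, the minimal sketch below shows how sensitivity, specificity, positive and negative predictive values, and overall accuracy all derive from a single 2×2 confusion matrix; the counts used are hypothetical and are not the study's data.

```python
# Illustrative only: relating standard diagnostic metrics to 2x2 confusion-matrix counts.
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute diagnostic test metrics from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),               # true-positive rate
        "specificity": tn / (tn + fp),               # true-negative rate
        "ppv": tp / (tp + fp),                       # positive predictive value
        "npv": tn / (tn + fn),                       # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts for a smartwatch-versus-reference comparison (not study data).
print(diagnostic_metrics(tp=25, fp=7, fn=47, tn=271))
```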
Conclusion
Although smartwatches can provide SpO2 readings, their overall accuracy may not be sufficient to replace standard photoplethysmography technology for detecting hypoxia in patients with COVID-19.
{"title":"Accuracy of Smartwatch Pulse Oximetry Measurements in Hospitalized Patients With Coronavirus Disease 2019","authors":"Kevin Rajakariar MBBS , Paul Buntine MBBS , Andrew Ghaly MBBS , Zheng Cheng Zhu MBBS , Vihangi Abeygunawardana MD , Sarah Visakhamoorthy MBBS , Patrick J. Owen PhD , Shaun Tham MD , Liam Hackett MPH , Louise Roberts PhD , Jithin K. Sajeev MBBS, PhD , Nicholas Jones MBBS , Andrew W. Teh MBBS, PhD","doi":"10.1016/j.mcpdig.2024.02.001","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.02.001","url":null,"abstract":"<div><h3>Objective</h3><p>To assess the ability of 2 commercially available smartwatches to accurately detect clinically significant hypoxia in patients hospitalized with coronavirus-19 (COVID-19).</p></div><div><h3>Patients and Methods</h3><p>A prospective multicenter validation study was performed from November 1, 2021, to August 31, 2022, assessing the Apple Watch Series 7 and Withings ScanWatch inbuilt pulse oximetry, against simultaneous ward-based oximetry as the reference standard. Patients hospitalized with active COVID-19 infection not requiring intensive care admission were recruited.</p></div><div><h3>Results</h3><p>A total of 750 smartwatch pulse oximetry measurements and 400 ward oximetry readings were successfully obtained from 200 patients (male 54%, age 66±18 years). For the detection of clinically significant hypoxia, the Apple Watch had a sensitivity and specificity of 34.8% and 97.5%, respectively with a positive predictive value of 78.1% and negative predictive value of 85.6%. The Withings ScanWatch had a sensitivity and specificity of 68.5% and 80.8%, respectively with a positive predictive value of 44.7% and negative predictive value of 91.9%. The overall accuracy was 84.9% for the Apple Watch and 78.5% for the Withings ScanWatch. The Spearman rank correlation coefficients reported a moderate correlation to ward-based photoplethysmography (Apple: r<sub>s</sub>=0.61; Withings: r<sub>s</sub>=0.51, both <em>P</em><.01).</p></div><div><h3>Conclusion</h3><p>Although smartwatches are able to provide SpO<sub>2</sub> readings, their overall accuracy may not be sufficient to replace the standard photoplethysmography technology in detecting hypoxia in patients with COVID-19.</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 152-158"},"PeriodicalIF":0.0,"publicationDate":"2024-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000105/pdfft?md5=dbf3bf07a6737561ec1ad6f4adb7fdcd&pid=1-s2.0-S2949761224000105-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139985368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Leveraging the Metaverse for Enhanced Longevity as a Component of Health 4.0
Pub Date: 2024-02-25 | DOI: 10.1016/j.mcpdig.2024.01.007
Srinivasan S. Pillay MD; Patrick Candela BA; Ivana T. Croghan PhD; Ryan T. Hurt MD, PhD; Sara L. Bonnes MD, MS; Ravindra Ganesh MBBS, MD; Brent A. Bauer MD
In this review, we describe evidence that supports building a metaverse to promote healthy longevity. We propose that the metaverse offers several advantages: physical (architecture, music, and nature), social (accessibility, affordability, community-building, and relief of social anxiety), and therapeutic (immersive experiences, anti-inflammatory effects, and adjunctive use in complementary and integrative medicine). Lifelogging by patients may help clinicians personalize interventions by matching data to therapeutic outcomes. Although the metaverse cannot entirely replace our current model of care, a strategic approach will ensure adequate resource allocation and value assessment. In a collaborative effort between Reulay, Inc, and Mayo Clinic, we are building a platform for the delivery of personalized and idiographic interventions to promote healthy longevity. To this end, we are using specific science-informed art design to reduce stress and anxiety for patients, with the progressive addition of integrated care elements that connect to this framework and link treatment response to biomarkers relevant to healthy longevity. This review is a commentary on the thought process behind this effort.
{"title":"Leveraging the Metaverse for Enhanced Longevity as a Component of Health 4.0","authors":"Srinivasan S. Pillay MD , Patrick Candela BA , Ivana T. Croghan PhD , Ryan T. Hurt MD, PhD , Sara L. Bonnes MD, MS , Ravindra Ganesh MBBS, MD , Brent A. Bauer MD","doi":"10.1016/j.mcpdig.2024.01.007","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.01.007","url":null,"abstract":"<div><p>In this review, we describe evidence that supports building a metaverse to promote healthy longevity. We propose that the metaverse offers several physical advantages (architecture, music, and nature), social (accessibility, affordability, community-building, and relief of social anxiety), and therapeutic (immersive, anti-inflammatory, and adjunctive use in complementary and integrative medicine). Lifelogging by patients may help clinicians personalize interventions by matching data to therapeutic outcomes. Although the metaverse cannot entirely replace our current model of care, a strategic approach will ensure adequate resource allocation and value assessment. In a collaborative effort between Reulay, Inc and Mayo Clinic, we are building a platform for the delivery of personalized and idiographic interventions to promote healthy longevity. To this end, we are using specific science-informed art design to reduce stress and anxiety for patients, with the progressive addition of integrated care elements that connect to this framework and connect treatment response to biomarkers that are relevant to healthy longevity. This review is a commentary on the thought process behind this effort.</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 139-151"},"PeriodicalIF":0.0,"publicationDate":"2024-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000087/pdfft?md5=fe65aacfc7a505e1acf93b8a7a7b844e&pid=1-s2.0-S2949761224000087-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139944992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Automated Approach for Diagnosing Allergic Contact Dermatitis Using Deep Learning to Support Democratization of Patch Testing
Pub Date: 2024-02-23 | DOI: 10.1016/j.mcpdig.2024.01.006
Matthew R. Hall MD; Alexander D. Weston PhD; Mikolaj A. Wieczorek BA; Misty M. Hobbs MD; Maria A. Caruso BA; Habeeba Siddiqui BA; Laura M. Pacheco-Spann MS; Johanny L. Lopez-Dominguez MD; Coralle Escoda-Diaz BA; Rickey E. Carter PhD; Charles J. Bruce MB, ChB
Objective
To develop a deep learning algorithm for the analysis of patch testing.
Patients and Methods
A retrospective case series from January 1, 2010, to December 31, 2020, was constructed to develop a deep learning model for the classification of patch test results from photographs. To benchmark model performance, the performance of human expert readers who reviewed the same photographs while blinded to the original clinical physical examination findings was also measured.
Results
On the independent test set (n=5070 test site locations from 37 patients), the model achieved an area under the receiver operating characteristic curve of 0.89 (95% CI, 0.86-0.91) and an F1 score of 37.1. The optimal cutoff had a sensitivity of 70.1% (136/194; 95% CI, 63.1%-76.5%) and a specificity of 91.7% (4472/4876; 95% CI, 90.9%-92.5%).
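As a rough illustration of how such threshold-dependent metrics are derived (not the authors' code), the sketch below computes an AUC on synthetic labels and scores, selects an "optimal" cutoff by maximizing Youden's J (one common convention, assumed here), and reports sensitivity, specificity, and F1 at that cutoff.

```python
# Toy example with synthetic labels/scores; values are stand-ins, not study data.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score, f1_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)                            # 1 = positive test site
y_score = np.clip(0.4 * y_true + 0.6 * rng.random(500), 0, 1)    # synthetic model probabilities

auc = roc_auc_score(y_true, y_score)

# One common choice of "optimal" cutoff: maximize Youden's J = sensitivity + specificity - 1.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
best = int(np.argmax(tpr - fpr))
cutoff = thresholds[best]

y_pred = (y_score >= cutoff).astype(int)
print(f"AUC={auc:.2f}  cutoff={cutoff:.2f}  "
      f"sensitivity={tpr[best]:.2f}  specificity={1 - fpr[best]:.2f}  "
      f"F1={f1_score(y_true, y_pred):.2f}")
```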
Conclusion
We demonstrated proof-of-concept utility for detecting allergic contact dermatitis using an automated deep learning approach.
{"title":"An Automated Approach for Diagnosing Allergic Contact Dermatitis Using Deep Learning to Support Democratization of Patch Testing","authors":"Matthew R. Hall MD , Alexander D. Weston PhD , Mikolaj A. Wieczorek BA , Misty M. Hobbs MD , Maria A. Caruso BA , Habeeba Siddiqui BA , Laura M. Pacheco-Spann MS , Johanny L. Lopez-Dominguez MD , Coralle Escoda-Diaz BA , Rickey E. Carter PhD , Charles J. Bruce MB, ChB","doi":"10.1016/j.mcpdig.2024.01.006","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.01.006","url":null,"abstract":"<div><h3>Objective</h3><p>To develop a deep learning algorithm for the analysis of patch testing.</p></div><div><h3>Patients and Methods</h3><p>A retrospective case series between January 1, 2010, and December 31, 2020, was constructed to develop a deep learning model for the classification of patch test results from photographs. The performance of human expert readers reviewing the same photographs blinded to the original clinical physical examination findings was measured to benchmark model performance.</p></div><div><h3>Results</h3><p>Model performance on the independent test set (n=5070 test site locations from 37 patients) achieved an area under the receiver operating characteristic curve of 0.89 (95% CI, 0.86-0.91) and an F1 score of 37.1. The optimal cutoff had a sensitivity of 70.1% (136/194; 95% CI, 63.1%-76.5%) and a specificity of 91.7% (4472/4876; 95% CI, 90.9%-92.5%).</p></div><div><h3>Conclusion</h3><p>We demonstrated proof-of-concept utility for detecting allergic contact dermatitis using an automated deep learning approach.</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 131-138"},"PeriodicalIF":0.0,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000075/pdfft?md5=ec00a6cd5a159970ea4a21e054923ea6&pid=1-s2.0-S2949761224000075-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139942544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Transformative Future for Health Care: On the First Year of Mayo Clinic Proceedings: Digital Health
Pub Date: 2024-02-17 | DOI: 10.1016/j.mcpdig.2024.02.002
Gianrico Farrugia MD
{"title":"A Transformative Future for Health Care: On the First Year of Mayo Clinic Proceedings: Digital Health","authors":"Gianrico Farrugia MD","doi":"10.1016/j.mcpdig.2024.02.002","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.02.002","url":null,"abstract":"","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 129-130"},"PeriodicalIF":0.0,"publicationDate":"2024-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000117/pdfft?md5=cbd982a82d4ca62307430904aba1007a&pid=1-s2.0-S2949761224000117-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139748980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Differences Between Patient and Clinician-Taken Images: Implications for Virtual Care of Skin Conditions
Pub Date: 2024-02-15 | DOI: 10.1016/j.mcpdig.2024.01.005
Rajeev V. Rikhye PhD; Grace Eunhae Hong BA; Preeti Singh MS; Margaret Ann Smith MBA; Aaron Loh MS; Vijaytha Muralidharan MD; Doris Wong BS; Rory Sayres PhD; Michelle Phung MS; Nicolas Betancourt MD; Bradley Fong BS; Rachna Sahasrabudhe BA; Khoban Nasim BS; Alec Eschholz BA; Yossi Matias PhD; Greg S. Corrado PhD; Katherine Chou MS; Dale R. Webster PhD; Peggy Bui MD, MBA; Yuan Liu PhD; Steven Lin MD
Objective
To understand and highlight the differences in clinical, demographic, and image quality characteristics between patient-taken (PAT) and clinic-taken (CLIN) photographs of skin conditions.
Patients and Methods
This retrospective study applied logistic regression to data from 2500 deidentified cases in Stanford Health Care’s eConsult system from November 2015 to January 2021. Cases with undiagnosable or multiple conditions, or cases with both patient and clinician image sources, were excluded, leaving 628 PAT cases and 1719 CLIN cases. Demographic characteristics, such as age and sex, were self-reported, whereas anatomic location, estimated skin type, clinical signs and symptoms, condition duration, and condition frequency were summarized from patient health records. Image quality variables, such as blur, lighting issues, and whether the image contained skin, hair, or nails, were estimated with a deep learning model.
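A hedged sketch of this kind of analysis follows: a logistic regression relating image source (clinic-taken vs patient-taken) to case characteristics, reported as odds ratios. The feature names and the synthetic data are illustrative assumptions, not the study's variables or results.

```python
# Illustrative only: logistic regression of image source on case characteristics.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500
# Columns (hypothetical): age_60_plus, blurry, skin_growth (all binary flags).
X = rng.integers(0, 2, size=(n, 3))
logits = 1.2 * X[:, 0] - 1.5 * X[:, 1] + 0.8 * X[:, 2] - 0.3
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)   # 1 = clinic-taken photograph

clf = LogisticRegression().fit(X, y)
# Odds ratios > 1 suggest a factor positively associated with clinic-taken images.
for name, coef in zip(["age_60_plus", "blurry", "skin_growth"], clf.coef_[0]):
    print(f"{name}: odds ratio {np.exp(coef):.2f}")
```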
Results
Factors positively associated with CLIN photographs after 2020 were as follows: age 60 years or older, darker skin types (eFST V/VI), and presence of skin growths. By contrast, factors positively associated with PAT photographs included conditions appearing intermittently, cases with blurry photographs, photographs with substantial nonskin (or nail/hair) regions, and cases with more than 3 photographs. Within the PAT cohort, older age was associated with blurry photographs.
Conclusion
There are various demographic, clinical, and image quality differences between PAT and CLIN photographs of skin concerns. The demographic differences present important considerations for improving digital literacy or access, whereas the image quality differences point to the need for improved patient education and better image capture workflows, particularly among older patients.
{"title":"Differences Between Patient and Clinician-Taken Images: Implications for Virtual Care of Skin Conditions","authors":"Rajeev V. Rikhye PhD , Grace Eunhae Hong BA , Preeti Singh MS , Margaret Ann Smith MBA , Aaron Loh MS , Vijaytha Muralidharan MD , Doris Wong BS , Rory Sayres PhD , Michelle Phung MS , Nicolas Betancourt MD , Bradley Fong BS , Rachna Sahasrabudhe BA , Khoban Nasim BS , Alec Eschholz BA , Yossi Matias PhD , Greg S. Corrado PhD , Katherine Chou MS , Dale R. Webster PhD , Peggy Bui MD, MBA , Yuan Liu PhD , Steven Lin MD","doi":"10.1016/j.mcpdig.2024.01.005","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.01.005","url":null,"abstract":"<div><h3>Objective</h3><p>To understand and highlight the differences in clinical, demographic, and image quality characteristics between patient-taken (PAT) and clinic-taken (CLIN) photographs of skin conditions.</p></div><div><h3>Patients and Methods</h3><p>This retrospective study applied logistic regression to data from 2500 deidentified cases in Stanford Health Care’s eConsult system, from November 2015 to January 2021. Cases with undiagnosable or multiple conditions or cases with both patient and clinician image sources were excluded, leaving 628 PAT cases and 1719 CLIN cases. Demographic characteristic factors, such as age and sex were self-reported, whereas anatomic location, estimated skin type, clinical signs and symptoms, condition duration, and condition frequency were summarized from patient health records. Image quality variables such as blur, lighting issues and whether the image contained skin, hair, or nails were estimated through a deep learning model.</p></div><div><h3>Results</h3><p>Factors that were positively associated with CLIN photographs, post-2020 were as follows: age 60 years or older, darker skin types (eFST V/VI), and presence of skin growths. By contrast, factors that were positively associated with PAT photographs include conditions appearing intermittently, cases with blurry photographs, photographs with substantial nonskin (or nail/hair) regions and cases with more than 3 photographs. Within the PAT cohort, older age was associated with blurry photographs.</p></div><div><h3>Conclusion</h3><p>There are various demographic, clinical, and image quality characteristic differences between PAT and CLIN photographs of skin concerns. The demographic characteristic differences present important considerations for improving digital literacy or access, whereas the image quality differences point to the need for improved patient education and better image capture workflows, particularly among elderly patients.</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 107-118"},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000063/pdfft?md5=b6821d4312bb7e3ec9c3c66208aec937&pid=1-s2.0-S2949761224000063-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139738259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Untapped Potential of Artificial Intelligence for Analysis of Epileptic Seizure Videos: A Clinician’s Expectation
Pub Date: 2024-02-15 | DOI: 10.1016/j.mcpdig.2024.01.004
Naotaka Usui MD, PhD
{"title":"Untapped Potential of Artificial Intelligence for Analysis of Epileptic Seizure Videos: A Clinician’s Expectation","authors":"Naotaka Usui MD, PhD","doi":"10.1016/j.mcpdig.2024.01.004","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.01.004","url":null,"abstract":"","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 104-106"},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000051/pdfft?md5=c551603f1e01d547a79eec0bbf642cfc&pid=1-s2.0-S2949761224000051-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139738298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Beyond Atrial Fibrillation: Machine Learning Algorithm Predicts Stroke in Adult Patients With Congenital Heart Disease
Pub Date: 2024-02-15 | DOI: 10.1016/j.mcpdig.2023.12.002
Anca Chiriac MD, PhD; Che Ngufor PhD; Holly K. van Houten BA; Raphael Mwangi MS; Malini Madhavan MBBS; Peter A. Noseworthy MD; Samuel J. Asirvatham MD; Sabrina D. Phillips MD; Christopher J. McLeod MB ChB, PhD
Objective
To develop and validate a robust risk prediction model for stroke and systemic embolism (SSE) in adult patients with congenital heart disease (ACHD), using artificial intelligence.
Patients and Methods
Deidentified insurance claims from the Optum Labs Data Warehouse, including enrollment records and medical and pharmacy claims for commercial and Medicare Advantage enrollees, were used to identify 49,276 patients with ACHD followed between January 1, 2009, and December 31, 2014. The group was randomly divided into development (70%) and validation (30%) cohorts. The development cohort was used to train 2 machine learning (ML) algorithms, regularized Cox regression (RegCox) and extreme gradient boosting (XGBoost), to predict SSE at 1, 2, and 5 years. The Shapley additive explanations (SHAP) method was used to identify the variables driving the SSE risk.
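The sketch below illustrates, under simplifying assumptions, the general pattern of one arm of such a pipeline: fitting a gradient-boosted classifier to predict SSE within a fixed horizon and ranking predictors by mean absolute SHAP value. The synthetic data, feature names, and binary-classification framing are illustrative, not the authors' implementation; it requires the xgboost and shap packages.

```python
# Illustrative only: gradient boosting + SHAP feature ranking on synthetic data.
import numpy as np
import xgboost as xgb
import shap

rng = np.random.default_rng(7)
n = 2000
features = ["age", "hypertension", "diabetes", "prior_stroke", "atrial_septal_defect"]

# Synthetic cohort: age plus binary comorbidity flags (hypothetical variables).
X = np.column_stack([rng.normal(60, 18, n), rng.integers(0, 2, size=(n, 4))])
risk = 0.03 * (X[:, 0] - 60) + 0.8 * X[:, 3] + 1.0 * X[:, 4] - 3.0
y = (rng.random(n) < 1 / (1 + np.exp(-risk))).astype(int)   # 1 = SSE within horizon

model = xgb.XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
model.fit(X, y)

# Mean absolute SHAP value per feature as a simple importance ranking.
shap_values = shap.TreeExplainer(model).shap_values(X)
importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(features, importance), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```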
Results
Within this large and diverse cohort of patients with ACHD (mean age, 59±19 years; 25,390 [51.5%] female; 35,766 [77.6%] White), 1756 (3.6%) patients experienced SSE during follow-up. In the validation cohort, the CHA2DS2-VASc score had an area under the receiver operating characteristic curve (AUC) of 0.66 for predicting SSE at 1, 2, and 5 years. RegCox had the best predictive performance, with AUCs of 0.82, 0.81, and 0.80 at 1, 2, and 5 years; XGBoost had AUCs of 0.81, 0.80, and 0.79, respectively. Atrial septal defect (ASD) emerged as an important predictor of SSE uncovered by the unbiased ML algorithms. A new clinical risk score, the CHA2DS2-VASc-ASD2 score, provided improved SSE prediction in ACHD, although the ML models still outperformed it.
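For reference, the comparator score mentioned above can be written as a simple point sum. The sketch below implements the standard CHA2DS2-VASc components, which are well established; the study's new CHA2DS2-VASc-ASD2 variant adds an atrial septal defect term whose exact weighting is not specified in this abstract, so it is deliberately not modeled.

```python
def cha2ds2_vasc(chf: bool, htn: bool, age: int, diabetes: bool,
                 prior_stroke_tia: bool, vascular_disease: bool, female: bool) -> int:
    """Standard CHA2DS2-VASc score (0-9); the study's ASD2 extension is not included here."""
    score = 0
    score += 1 if chf else 0                               # C: congestive heart failure
    score += 1 if htn else 0                               # H: hypertension
    score += 2 if age >= 75 else (1 if age >= 65 else 0)   # A2: age >=75 (2 pts); A: 65-74 (1 pt)
    score += 1 if diabetes else 0                          # D: diabetes mellitus
    score += 2 if prior_stroke_tia else 0                  # S2: prior stroke/TIA/thromboembolism
    score += 1 if vascular_disease else 0                  # V: vascular disease
    score += 1 if female else 0                            # Sc: sex category (female)
    return score

# Example: a 70-year-old woman with hypertension scores 3.
print(cha2ds2_vasc(chf=False, htn=True, age=70, diabetes=False,
                   prior_stroke_tia=False, vascular_disease=False, female=True))
```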
Conclusion
ML models significantly outperformed the clinical risk scores in patients with ACHD.
{"title":"Beyond Atrial Fibrillation: Machine Learning Algorithm Predicts Stroke in Adult Patients With Congenital Heart Disease","authors":"Anca Chiriac MD, PhD , Che Ngufor PhD , Holly K. van Houten BA , Raphael Mwangi MS , Malini Madhavan MBBS , Peter A. Noseworthy MD , Samuel J. Asirvatham MD , Sabrina D. Phillips MD , Christopher J. McLeod MB ChB, PhD","doi":"10.1016/j.mcpdig.2023.12.002","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2023.12.002","url":null,"abstract":"<div><h3>Objective</h3><p>To develop and validate a robust risk prediction model for stroke and systemic embolism (SSE) in adult patients with congenital heart disease (ACHD), using artificial intelligence.</p></div><div><h3>Patients and Methods</h3><p>Deidentified insurance claims from the Optum Labs Data Warehouse, including enrollment records and medical and pharmacy claims for commercial and Medicare Advantage enrollees, were used to identify 49,276 patients with ACHD, followed between January 1, 2009, and December 31, 2014. The group was randomly divided into development (70%) and validation (30%) cohorts. The development cohort was used to train 2 machine learning (ML) algorithms, regularized Cox regression (RegCox), and extreme gradient boosting (XGBoost) to predict SSE at 1, 2, and 5 years. The Shapley additive explanations (SHAP) model was used to identify the variables particularly driving the SSE risk.</p></div><div><h3>Results</h3><p>Within this large and diverse cohort of patients with ACHD (mean age, 59 ± 19 years; 25,390 (51.5%) female, 35,766 [77.6%]) white), 1756 (3.6%) patients experienced SSE during follow-up. In the Validation cohort, CHA<sub>2</sub>DS<sub>2</sub>-VASC had an area under the receiver operating characteristics curve (AUC) of 0.66 for predicting SSE at 1-,2, and 5-years. RegCox had the best predictive performance, with AUCs of 0.82,.81, and.80 at 1-, 2, and 5-years. XGBoost had AUCs of 0.81, 0.80, and 0.79 respectively. Atrial septal defect (ASD) emerged as an important predictor for SSE uncovered by the unbiased ML algorithms. A new clinical risk score, the CHA<sub>2</sub>DS<sub>2</sub>-VASC-ASD<sub>2</sub> score, provides improved SSE prediction in ACHD. Yet, the ML models still outperformed this.</p></div><div><h3>Conclusion</h3><p>ML models significantly outperformed the clinical risk scores in patients with ACHD.</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 92-103"},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000026/pdfft?md5=c34fed3977be03552486d0740a93fe5f&pid=1-s2.0-S2949761224000026-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139738326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Appropriateness of Ophthalmology Recommendations From an Online Chat-Based Artificial Intelligence Model
Pub Date: 2024-02-15 | DOI: 10.1016/j.mcpdig.2024.01.003
Prashant D. Tailor MD; Timothy T. Xu MD; Blake H. Fortes MD; Raymond Iezzi MD; Timothy W. Olsen MD; Matthew R. Starr MD; Sophie J. Bakri MD; Brittni A. Scruggs MD, PhD; Andrew J. Barkmeier MD; Sanjay V. Patel MD; Keith H. Baratz MD; Ashlie A. Bernhisel MD; Lilly H. Wagner MD; Andrea A. Tooley MD; Gavin W. Roddy MD, PhD; Arthur J. Sit MD; Kristi Y. Wu MD; Erick D. Bothun MD; Sasha A. Mansukhani MBBS; Brian G. Mohney MD; Lauren A. Dalvin MD
Objective
To determine the appropriateness of recommendations provided by an online chat-based artificial intelligence model in response to ophthalmology questions.
Patients and Methods
This cross-sectional qualitative study was conducted from April 1, 2023, to April 30, 2023. A total of 192 questions were generated spanning all ophthalmic subspecialties. Each question was posed to a large language model (LLM) 3 times. The responses were graded by appropriate subspecialists as appropriate, inappropriate, or unreliable in 2 grading contexts. The first grading context was the information being presented on a patient information site; the second was the response serving as an LLM-generated draft reply to a patient query sent through the electronic medical record (EMR). Appropriate was defined as accurate and specific enough to serve as a surrogate for physician-approved information. The main outcome measure was the percentage of appropriate responses per subspecialty.
Results
For patient information site-related questions, the LLM provided an overall average of 79% appropriate responses. Average appropriateness for patient information site content varied across ophthalmic subspecialties, ranging from 56% to 100%: cataract or refractive (92%), cornea (56%), glaucoma (72%), neuro-ophthalmology (67%), oculoplastic or orbital surgery (80%), ocular oncology (100%), pediatrics (89%), vitreoretinal diseases (86%), and uveitis (65%). For draft responses to patient questions via the EMR, the LLM provided an overall average of 74% appropriate responses, which also varied by subspecialty: cataract or refractive (85%), cornea (54%), glaucoma (77%), neuro-ophthalmology (63%), oculoplastic or orbital surgery (62%), ocular oncology (90%), pediatrics (94%), vitreoretinal diseases (88%), and uveitis (55%). Stratifying grades across health information categories (disease and condition, risk and prevention, surgery-related, and treatment and management) showed notable but nonsignificant variations, with disease and condition often rated highest for appropriateness (72% and 69%) and surgery-related lowest (55% and 51%) in both contexts.
Conclusion
This LLM provided mostly appropriate responses across multiple ophthalmology subspecialties, both in the context of patient information sites and as EMR-related draft responses to patient questions. Current LLM offerings require optimization and improvement before widespread clinical use.
{"title":"Appropriateness of Ophthalmology Recommendations From an Online Chat-Based Artificial Intelligence Model","authors":"Prashant D. Tailor MD , Timothy T. Xu MD , Blake H. Fortes MD , Raymond Iezzi MD , Timothy W. Olsen MD , Matthew R. Starr MD , Sophie J. Bakri MD , Brittni A. Scruggs MD, PhD , Andrew J. Barkmeier MD , Sanjay V. Patel MD , Keith H. Baratz MD , Ashlie A. Bernhisel MD , Lilly H. Wagner MD , Andrea A. Tooley MD , Gavin W. Roddy MD, PhD , Arthur J. Sit MD , Kristi Y. Wu MD , Erick D. Bothun MD , Sasha A. Mansukhani MBBS , Brian G. Mohney MD , Lauren A. Dalvin MD","doi":"10.1016/j.mcpdig.2024.01.003","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.01.003","url":null,"abstract":"<div><h3>Objective</h3><p>To determine the appropriateness of ophthalmology recommendations from an online chat-based artificial intelligence model to ophthalmology questions.</p></div><div><h3>Patients and Methods</h3><p>Cross-sectional qualitative study from April 1, 2023, to April 30, 2023. A total of 192 questions were generated spanning all ophthalmic subspecialties. Each question was posed to a large language model (LLM) 3 times. The responses were graded by appropriate subspecialists as appropriate, inappropriate, or unreliable in 2 grading contexts. The first grading context was if the information was presented on a patient information site. The second was an LLM-generated draft response to patient queries sent by the electronic medical record (EMR). Appropriate was defined as accurate and specific enough to serve as a surrogate for physician-approved information. Main outcome measure was percentage of appropriate responses per subspecialty.</p></div><div><h3>Results</h3><p>For patient information site-related questions, the LLM provided an overall average of 79% appropriate responses. Variable rates of average appropriateness were observed across ophthalmic subspecialties for patient information site information ranging from 56% to 100%: cataract or refractive (92%), cornea (56%), glaucoma (72%), neuro-ophthalmology (67%), oculoplastic or orbital surgery (80%), ocular oncology (100%), pediatrics (89%), vitreoretinal diseases (86%), and uveitis (65%). For draft responses to patient questions via EMR, the LLM provided an overall average of 74% appropriate responses and varied by subspecialty: cataract or refractive (85%), cornea (54%), glaucoma (77%), neuro-ophthalmology (63%), oculoplastic or orbital surgery (62%), ocular oncology (90%), pediatrics (94%), vitreoretinal diseases (88%), and uveitis (55%). Stratifying grades across health information categories (disease and condition, risk and prevention, surgery-related, and treatment and management) showed notable but insignificant variations, with disease and condition often rated highest (72% and 69%) for appropriateness and surgery-related (55% and 51%) lowest, in both contexts.</p></div><div><h3>Conclusion</h3><p>This LLM reported mostly appropriate responses across multiple ophthalmology subspecialties in the context of both patient information sites and EMR-related responses to patient questions. Current LLM offerings require optimization and improvement before widespread clinical use.</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. 
Digital health","volume":"2 1","pages":"Pages 119-128"},"PeriodicalIF":0.0,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S294976122400004X/pdfft?md5=5523855f19c376cfc730f0de31cbe918&pid=1-s2.0-S294976122400004X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139738261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Artificial Intelligence Detection and Segmentation Models: A Systematic Review and Meta-Analysis of Brain Tumors in Magnetic Resonance Imaging
Pub Date: 2024-02-04 | DOI: 10.1016/j.mcpdig.2024.01.002
Ting-Wei Wang MD, PhD; Yu-Chieh Shiao MD; Jia-Sheng Hong PhD; Wei-Kai Lee PhD; Ming-Sheng Hsu MD; Hao-Min Cheng MD, PhD; Huai-Che Yang MD, PhD; Cheng-Chia Lee MD, PhD; Hung-Chuan Pan MD, PhD; Weir Chiang You MD, PhD; Jiing-Feng Lirng MD; Wan-Yuo Guo MD, PhD; Yu-Te Wu PhD
Objective
To thoroughly analyze the factors affecting the generalization ability of deep learning models for brain tumor detection and segmentation.
Patients and Methods
We searched PubMed, Embase, Web of Science, Cochrane Library, and IEEE from inception to July 25, 2023, and 19 studies with 12,000 patients were identified. The criteria required studies to use magnetic resonance imaging (MRI) for brain tumor detection and segmentation, offer clear performance metrics, and use external validation data sets. The study focused on outcomes such as sensitivity and Dice score. Study quality was assessed using QUADAS-2 and CLAIM tools. The meta-analysis evaluated varying algorithms and their performance across different validation data sets.
Results
Variation in MRI hardware across manufacturers may contribute to data set diversity, affecting AI model generalizability. The best algorithms had a pooled lesion-wise Dice score of 84%, with pooled sensitivities of 87% (patient-wise) and 86% (lesion-wise). Post-2022 methodologies highlighted evolving artificial intelligence techniques. Performance differences were evident among tumor types, likely because of size disparities. 3D models outperformed their 2D and ensemble counterparts in detection. Although specific preprocessing techniques improved segmentation outcomes, some hindered detection.
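As a quick reminder of the pooled segmentation metric reported above, the sketch below computes a Dice score for two toy binary masks; the arrays are illustrative, not study data.

```python
# Illustrative only: Dice score for binary segmentation masks.
import numpy as np

def dice_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks; returns 1.0 when both masks are empty."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    return 2.0 * intersection / denom if denom else 1.0

pred = np.zeros((64, 64), dtype=bool);  pred[20:40, 20:40] = True
truth = np.zeros((64, 64), dtype=bool); truth[25:45, 25:45] = True
print(f"Dice = {dice_score(pred, truth):.2f}")   # overlap of two offset 20x20 squares
```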
Conclusion
The study underscores the potential of deep learning for improving brain tumor diagnostics and treatment planning. We also identify the need for further research, including developing a comprehensive diversity index, expanding meta-analyses, and using generative adversarial networks for data diversification, paving the way for AI-driven advancements in oncological patient care.
{"title":"Artificial Intelligence Detection and Segmentation Models: A Systematic Review and Meta-Analysis of Brain Tumors in Magnetic Resonance Imaging","authors":"Ting-Wei Wang MD, PhD , Yu-Chieh Shiao MD , Jia-Sheng Hong PhD , Wei-Kai Lee PhD , Ming-Sheng Hsu MD , Hao-Min Cheng MD, PhD , Huai-Che Yang MD, PhD , Cheng-Chia Lee MD, PhD , Hung-Chuan Pan MD, PhD , Weir Chiang You MD, PhD , Jiing-Feng Lirng MD , Wan-Yuo Guo MD, PhD , Yu-Te Wu PhD","doi":"10.1016/j.mcpdig.2024.01.002","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.01.002","url":null,"abstract":"<div><h3>Objective</h3><p>To thoroughly analyze factors affecting the generalization ability of deep learning algorithms on brain tumor detection and segmentation models.</p></div><div><h3>Patients and Methods</h3><p>We searched PubMed, Embase, Web of Science, Cochrane Library, and IEEE from inception to July 25, 2023, and 19 studies with 12,000 patients were identified. The criteria required studies to use magnetic resonance imaging (MRI) for brain tumor detection and segmentation, offer clear performance metrics, and use external validation data sets. The study focused on outcomes such as sensitivity and Dice score. Study quality was assessed using QUADAS-2 and CLAIM tools. The meta-analysis evaluated varying algorithms and their performance across different validation data sets.</p></div><div><h3>Results</h3><p>MRI hardware as the manufacturer may contribute to data set diversity, impacting AI model generalizability. The study found that the best algorithms had a pooled lesion-wise Dice score of 84%, with pooled sensitivities of 87% (patient-wise) and 86% (lesion-wise). Post-2022 methodologies highlighted evolving artificial intelligence techniques. Performance differences were evident among tumor types, likely due to size disparities. 3D models outperformed their 2D and ensemble counterparts in detection. Although specific preprocessing techniques improved segmentation outcomes, some hindered detection.</p></div><div><h3>Conclusion</h3><p>The study underscores the potential of deep learning in improving brain tumor diagnostics and treatment planning. We also identify the need for further research, including developing a comprehensive diversity index, expanded meta-analyses, and using generative adversarial networks for data diversification, paving the way for AI-driven advancements in oncological patient care.</p></div><div><h3>Trial Registration</h3><p>PROPERO (CRD42023459108).</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 75-91"},"PeriodicalIF":0.0,"publicationDate":"2024-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000038/pdfft?md5=462accb0c195aebed809efe8ef0de1df&pid=1-s2.0-S2949761224000038-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139675998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thyroid Ultrasound Appropriateness Identification Through Natural Language Processing of Electronic Health Records
Pub Date: 2024-02-01 | DOI: 10.1016/j.mcpdig.2024.01.001
Cristian Soto Jacome MD; Danny Segura Torres MD; Jungwei W. Fan PhD; Ricardo Loor-Torres MD; Mayra Duran MD; Misk Al Zahidy MS; Esteban Cabezas MD; Mariana Borras-Osorio MD; David Toro-Tobon MD; Yuqi Wu PhD; Yonghui Wu PhD; Naykky Singh Ospina MD, MS; Juan P. Brito MD, MS
Objective
To address thyroid cancer overdiagnosis, we aimed to develop a natural language processing (NLP) algorithm to determine the appropriateness of thyroid ultrasounds (TUS).
Patients and Methods
Between 2017 and 2021, we identified 18,000 patients who underwent TUS at Mayo Clinic and selected 628 for chart review to create a consensus-based ground truth dataset. We developed a rule-based NLP pipeline to classify each TUS as appropriate (aTUS) or inappropriate (iTUS) using patients’ clinical notes and additional metadata. In addition, we designed an abbreviated NLP pipeline (aNLP) focusing solely on labels from TUS order requisitions to facilitate deployment at other health care systems. Our dataset was split into a training set of 468 (75%) and a test set of 160 (25%), using the former for rule development and the latter for performance evaluation.
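To make the idea concrete, here is a toy sketch of a rule-based check in the spirit of the pipeline described above; the indication keywords and the aTUS/iTUS decision rule are hypothetical illustrations, not the study's actual rule set.

```python
# Illustrative only: a toy rule-based appropriateness classifier for ultrasound orders.
import re

# Hypothetical patterns for indications that would make an order appropriate.
APPROPRIATE_PATTERNS = [
    r"\bpalpable (thyroid )?nodule\b",
    r"\babnormal (tsh|thyroid function)\b",
    r"\bfollow[- ]?up of known nodule\b",
]

def classify_tus_order(requisition_text: str) -> str:
    """Return 'aTUS' if any appropriate-indication rule fires, else 'iTUS'."""
    text = requisition_text.lower()
    if any(re.search(pattern, text) for pattern in APPROPRIATE_PATTERNS):
        return "aTUS"
    return "iTUS"

print(classify_tus_order("Ultrasound requested for palpable thyroid nodule on exam"))  # aTUS
print(classify_tus_order("Ultrasound requested for fatigue"))                          # iTUS
```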
Results
In the training set, 449 (95.9%) patients were identified as aTUS and 19 (4.06%) as iTUS; in the test set, 155 (96.88%) patients were identified as aTUS and 5 (3.12%) as iTUS. In the training set, the pipeline achieved a sensitivity of 0.99, a specificity of 0.95, and a positive predictive value of 1.0 for detecting aTUS. In the testing cohort, sensitivity was 0.96, specificity 0.80, and positive predictive value 0.99. Similar performance metrics were observed in the aNLP pipeline.
Conclusion
The NLP models can accurately identify the appropriateness of a thyroid ultrasound from clinical documentation and order requisition information, a critical initial step toward evaluating the drivers and outcomes of TUS use and subsequent thyroid cancer overdiagnosis.
{"title":"Thyroid Ultrasound Appropriateness Identification Through Natural Language Processing of Electronic Health Records","authors":"Cristian Soto Jacome MD , Danny Segura Torres MD , Jungwei W. Fan PhD , Ricardo Loor-Torres MD , Mayra Duran MD , Misk Al Zahidy MS , Esteban Cabezas MD , Mariana Borras-Osorio MD , David Toro-Tobon MD , Yuqi Wu PhD , Yonghui Wu PhD , Naykky Singh Ospina MD, MS , Juan P. Brito MD, MS","doi":"10.1016/j.mcpdig.2024.01.001","DOIUrl":"https://doi.org/10.1016/j.mcpdig.2024.01.001","url":null,"abstract":"<div><h3>Objective</h3><p>To address thyroid cancer overdiagnosis, we aim to develop a natural language processing (NLP) algorithm to determine the appropriateness of thyroid ultrasounds (TUS).</p></div><div><h3>Patients and Methods</h3><p>Between 2017 and 2021, we identified 18,000 TUS patients at Mayo Clinic and selected 628 for chart review to create a ground truth dataset based on consensus. We developed a rule-based NLP pipeline to identify TUS as appropriate TUS (aTUS) or inappropriate TUS (iTUS) using patients’ clinical notes and additional meta information. In addition, we designed an abbreviated NLP pipeline (aNLP) solely focusing on labels from TUS order requisitions to facilitate deployment at other health care systems. Our dataset was split into a training set of 468 (75%) and a test set of 160 (25%), using the former for rule development and the latter for performance evaluation.</p></div><div><h3>Results</h3><p>There were 449 (95.9%) patients identified as aTUS and 19 (4.06%) as iTUS in the training set; there are 155 (96.88%) patients identified as aTUS and 5 (3.12%) were iTUS in the test set. In the training set, the pipeline achieved a sensitivity of 0.99, specificity of 0.95, and positive predictive value of 1.0 for detecting aTUS. The testing cohort revealed a sensitivity of 0.96, specificity of 0.80, and positive predictive value of 0.99. Similar performance metrics were observed in the aNLP pipeline.</p></div><div><h3>Conclusion</h3><p>The NLP models can accurately identify the appropriateness of a thyroid ultrasound from clinical documentation and order requisition information, a critical initial step toward evaluating the drivers and outcomes of TUS use and subsequent thyroid cancer overdiagnosis.</p></div>","PeriodicalId":74127,"journal":{"name":"Mayo Clinic Proceedings. Digital health","volume":"2 1","pages":"Pages 67-74"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2949761224000014/pdfft?md5=b25e9a7547bfbd148935d7e81234eadb&pid=1-s2.0-S2949761224000014-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139674437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}