Pub Date: 2024-10-29 | eCollection Date: 2024-10-01 | DOI: 10.1371/journal.pdig.0000456
Alexander D VanHelene, Ishaani Khatri, C Beau Hilton, Sanjay Mishra, Ece D Gamsiz Uzun, Jeremy L Warner
Meta-researchers commonly leverage tools that infer gender from first names, especially when studying gender disparities. However, these tools vary in their accuracy, ease of use, and cost. The objective of this study was to compare the accuracy and cost of the commercial services Genderize and Gender API and the open-source gender R package. Differences in binary gender prediction accuracy among the three services were evaluated. Gender prediction accuracy was tested on a multinational dataset of 32,968 gender-labeled clinical trial authors. Additionally, two datasets from previous studies, with 5,779 and 6,131 names respectively, were re-evaluated with modern implementations of Genderize and Gender API. The gender inference accuracy of Genderize and Gender API was compared both with and without supplying trialists' countries of origin in the API call; the accuracy of the gender R package was evaluated only without countries of origin. Accuracy was defined as the percentage of correct gender predictions, and accuracy differences between methods were evaluated using McNemar's test. Genderize and Gender API demonstrated 96.6% and 96.1% accuracy, respectively, when countries of origin were not supplied in the API calls. Both achieved their highest accuracy (greater than 98%) when predicting the gender of German authors, and were least accurate (below 82%) with South Korean, Chinese, Singaporean, and Taiwanese authors. Genderize can provide accuracy similar to Gender API while being 4.85x less expensive. The gender R package achieved below 86% accuracy on the full dataset. In the replication studies, Genderize and Gender API demonstrated better performance than in the original publications. Our results indicate that Genderize and Gender API achieve similar accuracy on a multinational dataset, while the gender R package is uniformly less accurate than both.
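The paired-accuracy comparison described above can be reproduced with McNemar's test on the discordant pairs (names one tool classifies correctly and the other does not). A minimal standard-library sketch, not the authors' code; the discordant-pair counts below are hypothetical:

```python
import math

def mcnemar(b: int, c: int) -> tuple[float, float]:
    """McNemar's chi-square test with continuity correction.

    b: names classified correctly by tool A but not tool B.
    c: names classified correctly by tool B but not tool A.
    Returns (statistic, p_value); the p-value is the chi-square
    survival function with 1 degree of freedom, which for 1 df
    equals erfc(sqrt(x / 2)).
    """
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Hypothetical discordant counts for two gender-inference tools:
stat, p = mcnemar(b=120, c=90)
```

Concordant pairs (both tools right, or both wrong) do not enter the statistic; only disagreements carry information about which tool is more accurate.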
Title: "Inferring gender from first names: Comparing the accuracy of Genderize, Gender API, and the gender R package on authors of diverse nationality." PLOS digital health 3(10): e0000456. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11521266/pdf/
Pub Date: 2024-10-25 | eCollection Date: 2024-10-01 | DOI: 10.1371/journal.pdig.0000636
Nowell M Fine, Sunil V Kalmady, Weijie Sun, Russ Greiner, Jonathan G Howlett, James A White, Finlay A McAlister, Justin A Ezekowitz, Padma Kaul
Aims: Patients visiting the emergency department (ED) or hospitalized for heart failure (HF) are at increased risk for subsequent adverse outcomes; however, effective risk stratification remains challenging. We used a machine learning (ML)-based approach to identify HF patients at risk of adverse outcomes after an ED visit or hospitalization, using a large regional administrative healthcare data system.
Methods and results: Patients visiting the ED or hospitalized with HF between 2002 and 2016 in Alberta, Canada were included. Outcomes of interest were 30-day and 1-year HF-related ED visits, HF hospital readmission, or all-cause mortality. We applied a feature extraction method using deep feature synthesis from multiple sources of health data and compared the performance of a gradient boosting algorithm (CatBoost) with logistic regression modelling. The area under the receiver operating characteristic curve (AUC-ROC) was used to assess model performance. We included 50,630 patients with 93,552 HF ED visits/hospitalizations. At 30-day follow-up in the holdout validation cohort, the AUC-ROC for the combined endpoint of HF ED visit, HF hospital readmission, or death was 74.16 (73.18-75.11) for the CatBoost model versus 62.25 (61.25-63.18) for logistic regression. At 1-year follow-up, the corresponding values were 76.80 (76.10-77.47) versus 69.52 (68.77-70.26). AUC-ROC values for the endpoint of all-cause death alone at 30-day and 1-year follow-up were 83.21 (81.83-84.41) versus 69.53 (67.98-71.18), and 85.73 (85.14-86.29) versus 69.40 (68.57-70.26), for the CatBoost and logistic regression models, respectively.
Conclusions: ML-based modelling with deep feature synthesis provided superior risk stratification for HF patients at 30 days and 1 year of follow-up after an ED visit or hospitalization, using data from a large regional administrative healthcare system.
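The AUC-ROC metric used to compare the models above has a simple rank-based interpretation: the probability that a randomly chosen positive case is scored above a randomly chosen negative case (Mann-Whitney U). A minimal sketch with made-up risk scores, not the study's pipeline:

```python
def auc_roc(scores, labels):
    """AUC-ROC via pairwise comparisons: the fraction of
    (positive, negative) pairs where the positive case gets the
    higher predicted risk; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks for six patients
# (label 1 = HF readmission/ED visit/death within follow-up):
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1,   1,   0,   1,   0,   0]
print(auc_roc(scores, labels))  # 8/9, about 0.889
```

The O(P*N) pairwise loop is fine for illustration; production code would sort once and use ranks instead.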
Title: "Machine Learning For Risk Prediction After Heart Failure Emergency Department Visit or Hospital Admission Using Administrative Health Data." PLOS digital health 3(10): e0000636. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11508085/pdf/
Pub Date: 2024-10-25 | eCollection Date: 2024-10-01 | DOI: 10.1371/journal.pdig.0000638
Hind Mohamed, Esme Kittle, Nehal Nour, Ruba Hamed, Kaylem Feeney, Jon Salsberg, Dervla Kelly
Health information on the Internet has a ubiquitous influence on health consumers' behaviour. Searching for and evaluating online health information poses a real challenge for many health consumers. To our knowledge, this systematic review is the first to explore interventions targeting lay people to improve their e-health literacy skills. Our paper aims to explore interventions to improve laypeople's ability to identify trustworthy online health information. The search was conducted in Ovid Medline, Embase, the Cochrane database, Academic Search Complete, and APA PsycInfo. Publications were selected by screening titles, abstracts, and full texts, followed by manual review of the reference lists of selected publications. Data were extracted from eligible studies into an Excel sheet, covering the types of interventions, their outcomes and effectiveness, and the barriers to and facilitators of consumers' use of the interventions. A mixed-methods appraisal tool was used to appraise evidence from quantitative, qualitative, and mixed-methods studies. Whittemore and Knafl's integrative review approach was used as guidance for narrative synthesis. Twelve studies were included. Media literacy interventions were the most common type of intervention. Few studies measured the effect of the interventions on patient health outcomes. All of the procedural and navigation/evaluation skills-building interventions were significantly effective. Computer/internet illiteracy and the absence of guidance/facilitators were significant barriers to web-based intervention use. Few interventions were distinguished by implementation in a context tailored to consumers, use of a human-centred design approach, and delivery through a partnership of multiple health stakeholders. There is potential for further research on how to improve consumers' use of health information, focusing on collaborative learning, using human-centred approaches, and addressing the social determinants of health.
Title: "An integrative systematic review on interventions to improve layperson's ability to identify trustworthy digital health information." PLOS digital health 3(10): e0000638. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11508166/pdf/
Innovative information-sharing techniques and rapid access to stored research data as scientific currency have proved highly beneficial in healthcare and health research. Yet researchers often experience conflict between sharing data to promote health-related scientific knowledge for the common good and their personal academic advancement. There is a scarcity of studies exploring the perspectives of health researchers in sub-Saharan Africa (SSA) on the challenges of data sharing in the context of data-intensive research. The study began with a quantitative survey, after which the researchers engaged in a qualitative study. This qualitative cross-sectional baseline study reports on the challenges health researchers face with data sharing. In-depth interviews were conducted via Microsoft Teams between July 2022 and April 2023 with 16 health researchers from 16 different countries across SSA. We employed purposive and snowball sampling techniques to invite participants via email. The recorded interviews were transcribed, coded, and analysed thematically using ATLAS.ti. Five recurrent themes and several subthemes emerged, related to (1) individual researcher concerns (fears regarding data sharing, publication and manuscript pressure), (2) structural issues impacting data sharing, (3) recognition in academia (scooping of research data, acknowledgement, and research incentives), (4) ethical challenges experienced by health researchers in SSA (confidentiality and informed consent, commercialisation and benefit sharing), and (5) legal lacunae (gaps in laws and regulations). Significant discomfort about data sharing exists amongst health researchers in this sample of respondents from SSA, resulting in a reluctance to share data despite acknowledgement of the scientific benefits of such sharing. This discomfort is related to the lack of adequate guidelines and governance processes for health research collaborations, both locally and internationally. Consequently, concerns about ethical and legal issues are increasing. Resources are needed in SSA to improve the quality, value, and veracity of data, as these are ethical imperatives. Strengthening data governance via robust guidelines, legislation, and appropriate data-sharing agreements will increase trust amongst health researchers and data donors alike.
Pub Date: 2024-10-24 | DOI: 10.1371/journal.pdig.0000635
Jyothi Chabilall, Qunita Brown, Nezerith Cengiz, Keymanthri Moodley
Title: "Data as scientific currency: Challenges experienced by researchers with sharing health data in sub-Saharan Africa." PLOS digital health 3(10): e0000635. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11500889/pdf/
Pub Date: 2024-10-24 | eCollection Date: 2024-10-01 | DOI: 10.1371/journal.pdig.0000465
Sooin Lee, Bryce Benson, Ashwin Belle, Richard P Medlin, David Jerkins, Foster Goss, Ashish K Khanna, Michael A DeVita, Kevin R Ward
Identifying the onset of patient deterioration is challenging, despite the potential to respond to patients earlier with better vital sign monitoring and rapid response team (RRT) activation. In this study, an ECG-based software as a medical device, the Analytic for Hemodynamic Instability Predictive Index (AHI-PI), was compared to the vital signs of heart rate, blood pressure, and respiratory rate, evaluating how early it indicated risk before an RRT activation. A higher proportion of events had risk indicated by AHI-PI (92.71%) than by vital signs (41.67%). AHI-PI indicated risk early, on average more than a day before RRT events. In events whose risk was indicated by both AHI-PI and vital signs, AHI-PI recognized deterioration earlier than the vital signs did. A case-control study showed that situations requiring RRTs were more likely to have AHI-PI risk indication than those that did not. The study yielded several insights supporting AHI-PI's efficacy as a clinical decision support system. The findings demonstrate AHI-PI's potential to serve as a reliable predictor of future RRT events. It could help clinicians recognize early clinical deterioration and respond to deterioration unnoticed by vital signs, thereby improving clinical outcomes.
Title: "Use of a continuous single lead electrocardiogram analytic to predict patient deterioration requiring rapid response team activation." PLOS digital health 3(10): e0000465. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11500862/pdf/
Pub Date: 2024-10-23 | eCollection Date: 2024-10-01 | DOI: 10.1371/journal.pdig.0000642
Elizabeth A Campbell, Saurav Bose, Aaron J Masino
Electronic Health Records (EHRs) are increasingly used to develop machine learning models in predictive medicine. There has been limited research on using machine learning methods to predict childhood obesity, or on related disparities in classifier performance among vulnerable patient subpopulations. In this work, classification models were developed to recognize pediatric obesity using temporal condition patterns obtained from patient EHR data in a U.S. study population. We trained four machine learning algorithms (Logistic Regression, Random Forest, Gradient Boosted Trees, and Neural Networks) to classify cases and controls as obesity positive or negative, and optimized hyperparameter settings through a bootstrapping methodology. To assess the classifiers for bias, we studied model performance by population subgroup, then used permutation analysis to identify the most predictive features for each model and the demographic characteristics of patients with these features. Mean AUC-ROC values were consistent across classifiers, ranging from 0.72 to 0.80. Some evidence of bias was identified, although in the direction of the models performing better for minority subgroups (African Americans and patients enrolled in Medicaid). Permutation analysis revealed that patients from vulnerable population subgroups were over-represented among patients with the most predictive diagnostic patterns. We hypothesize that our models performed better on under-represented groups because the features most strongly associated with obesity were more commonly observed among minority patients. These findings highlight the complex ways bias may arise in machine learning models, and they can inform future research toward a thorough analytical approach for identifying and mitigating bias that may arise from features and within EHR datasets when developing more equitable models.
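The permutation analysis mentioned above follows the standard permutation-importance recipe: shuffle one feature's values and measure the drop in model performance. A toy standard-library sketch, not the authors' pipeline; the model, data, and names here are entirely hypothetical:

```python
import random

def permutation_importance(predict, X, y, feature_idx, n_repeats=10, seed=0):
    """Importance of one feature = baseline accuracy minus the mean
    accuracy after shuffling that feature's column n_repeats times."""
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    drops = []
    for _ in range(n_repeats):
        col = [row[feature_idx] for row in X]
        rng.shuffle(col)  # break the feature-outcome association
        shuffled = [row[:feature_idx] + [v] + row[feature_idx + 1:]
                    for row, v in zip(X, col)]
        drops.append(baseline - accuracy(shuffled))
    return sum(drops) / n_repeats

# Toy classifier that only looks at feature 0:
predict = lambda row: 1 if row[0] > 0.5 else 0
X = [[0.9, 0.1], [0.8, 0.9], [0.2, 0.8], [0.1, 0.2]]
y = [1, 1, 0, 0]
print(permutation_importance(predict, X, y, 0))  # positive: feature 0 matters
print(permutation_importance(predict, X, y, 1))  # 0.0: feature 1 is ignored
```

The study pairs this feature-level view with a demographic one: inspecting which patient subgroups carry the highest-importance features.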
Title: "Conceptualizing bias in EHR data: A case study in performance disparities by demographic subgroups for a pediatric obesity incidence classifier." PLOS digital health 3(10): e0000642. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11498669/pdf/
Pub Date: 2024-10-22 | eCollection Date: 2024-10-01 | DOI: 10.1371/journal.pdig.0000633
Eman Metwally, Sarah E Soppe, Jennifer L Lund, Sharon Peacock Hinton, Caroline A Thompson
Background: Investigators often use claims data to estimate the diagnosis timing of chronic conditions. However, misclassification of chronic conditions is common due to variability in healthcare utilization and in claims history across patients.
Objective: We aimed to quantify the effect of varying the Medicare fee-for-service continuous enrollment period and lookback period (LBP) on misclassification of COPD and on sample size.
Methods: We present a stepwise tutorial for classifying COPD based on its diagnosis timing relative to lung cancer diagnosis, using the Surveillance, Epidemiology, and End Results (SEER) cancer registry linked to Medicare insurance claims. We used 3 approaches, varying the LBP and the required continuous enrollment (i.e., observability) period from 1 to 5 years. Patients with lung cancer were classified into 3 groups based on their COPD-related healthcare utilization: pre-existing COPD (diagnosis at least 3 months before lung cancer diagnosis), concurrent COPD (diagnosis within ±3 months of lung cancer diagnosis), and non-COPD. Among those with 5 years of continuous enrollment, we estimated the sensitivity of the LBP for ascertaining COPD diagnosis as the number of patients with pre-existing COPD identified using a shorter LBP divided by the number identified using a longer LBP.
Results: Extending the LBP from 1 to 5 years increased the prevalence of pre-existing COPD from ~36% to 51% and decreased both concurrent COPD (from ~34% to 23%) and non-COPD (from ~29% to 25%). Extending the required continuous enrollment period beyond one year had minimal effect across the various LBPs. In those with 5 years of continuous enrollment, the sensitivity of COPD classification (95% CI) increased with longer LBP, from 70.1% (69.7% to 70.4%) for the 1-year LBP to 100% for the 5-year LBP.
Conclusion: The optimal length of the LBP and continuous enrollment period depends on the research question and the data-generating mechanisms. Among Medicare beneficiaries, the best approach to identifying the diagnosis timing of COPD relative to lung cancer diagnosis is to use all available lookback data with at least one year of required continuous enrollment.
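The three-group classification and the LBP sensitivity calculation from the Methods can be sketched as follows. Field names and the exact boundary handling (e.g., whether day -90 counts as pre-existing or concurrent) are assumptions for illustration; the paper's SEER-Medicare implementation may differ.

```python
from datetime import date
from typing import Optional

def classify_copd(copd_dx: Optional[date], cancer_dx: date) -> str:
    """Assign a patient to one of the three groups by COPD diagnosis timing."""
    if copd_dx is None:
        return "non-COPD"
    delta_days = (copd_dx - cancer_dx).days
    if delta_days <= -90:        # at least ~3 months before lung cancer diagnosis
        return "pre-existing"
    if abs(delta_days) <= 90:    # within ~3 months on either side
        return "concurrent"
    return "non-COPD"            # diagnosed well after; not baseline COPD here

def lbp_sensitivity(n_pre_short: int, n_pre_long: int) -> float:
    """Sensitivity of a shorter LBP relative to a longer (reference) LBP."""
    return n_pre_short / n_pre_long

print(classify_copd(date(2019, 1, 15), date(2020, 6, 1)))   # pre-existing
print(lbp_sensitivity(701, 1000))                           # 0.701
```

The sensitivity ratio mirrors the paper's definition: pre-existing cases found with the short lookback divided by those found with the long lookback, so it approaches 1.0 as the LBP lengthens.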
{"title":"Impact of observability period on the classification of COPD diagnosis timing among Medicare beneficiaries with lung cancer.","authors":"Eman Metwally, Sarah E Soppe, Jennifer L Lund, Sharon Peacock Hinton, Caroline A Thompson","doi":"10.1371/journal.pdig.0000633","DOIUrl":"https://doi.org/10.1371/journal.pdig.0000633","url":null,"abstract":"<p><strong>Background: </strong>Investigators often use claims data to estimate the diagnosis timing of chronic conditions. However, misclassification of chronic conditions is common due to variability in healthcare utilization and in claims history across patients.</p><p><strong>Objective: </strong>We aimed to quantify the effect of various Medicare fee-for-service continuous enrollment period and lookback period (LBP) on misclassification of COPD and sample size.</p><p><strong>Methods: </strong>A stepwise tutorial to classify COPD, based on its diagnosis timing relative to lung cancer diagnosis using the Surveillance Epidemiology and End Results cancer registry linked to Medicare insurance claims. We used 3 approaches varying the LBP and required continuous enrollment (i.e., observability) period between 1 to 5 years. Patients with lung cancer were classified based on their COPD related healthcare utilization into 3 groups: pre-existing COPD (diagnosis at least 3 months before lung cancer diagnosis), concurrent COPD (diagnosis during the -/+ 3months of lung cancer diagnosis), and non-COPD. Among those with 5 years of continuous enrollment, we estimated the sensitivity of the LBP to ascertain COPD diagnosis as the number of patients with pre-existing COPD using a shorter LBP divided by the number of patients with pre-existing COPD using a longer LBP.</p><p><strong>Results: </strong>Extending the LBP from 1 to 5 years increased prevalence of pre-existing COPD from ~ 36% to 51%, decreased both concurrent COPD from ~ 34% to 23% and non-COPD from ~ 29% to 25%. 
There was minimal effect of extending the required continuous enrollment period beyond one year across various LBPs. In those with 5 years of continuous enrollment, sensitivity of COPD classification (95% CI) increased with longer LBP from 70.1% (69.7% to 70.4%) for one-year LBP to 100% for 5-years LBP.</p><p><strong>Conclusion: </strong>The length of optimum LBP and continuous enrollment period depends on the context of the research question and the data generating mechanisms. Among Medicare beneficiaries, the best approach to identify diagnosis timing of COPD relative to lung cancer diagnosis is to use all available LBP with at least one year of required continuous enrollment.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"3 10","pages":"e0000633"},"PeriodicalIF":0.0,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11495636/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142514297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-21eCollection Date: 2024-10-01DOI: 10.1371/journal.pdig.0000640
Wei Liao, Joel Voldman
Recent work in machine learning for healthcare has raised concerns about patient privacy and algorithmic fairness. Previous work has shown that self-reported race can be predicted from medical data that does not explicitly contain racial information. However, the extent to which such information is encoded is unknown, and we lack ways to develop models whose outcomes are minimally affected by it. Here we systematically investigated the ability of time-series electronic health record data to predict patient static information. We found that not only the raw time-series data but also learned representations from machine learning models can be used to predict a variety of static information, with area under the receiver operating characteristic curve as high as 0.851 for biological sex, 0.869 for binarized age, and 0.810 for self-reported race. This high predictive performance extends to various comorbidity factors and persists even across different training tasks, cohorts, model architectures, and databases. Given the privacy and fairness concerns these findings pose, we developed a variational autoencoder-based approach that learns a structured latent space to disentangle patient-sensitive attributes from time-series data. Our work thoroughly investigates the ability of machine learning models to encode patient static information from time-series electronic health records and introduces a general approach to protecting patient-sensitive information in downstream tasks.
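The core probing idea, checking whether a static attribute can be recovered from time-series data, can be illustrated with a minimal synthetic example. The study itself uses real EHR time series and learned model representations; the toy data, leakage mechanism, and probe below are illustrative assumptions only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_patients, n_steps, n_feats = 500, 24, 4
X = rng.normal(size=(n_patients, n_steps, n_feats))
static = rng.integers(0, 2, n_patients)   # a synthetic binary static attribute
X[static == 1] += 0.3                     # leak the attribute into the series

X_flat = X.reshape(n_patients, -1)        # flatten the time dimension
X_tr, X_te, y_tr, y_te = train_test_split(
    X_flat, static, test_size=0.25, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probe_auc = roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1])
print(f"probe AUC-ROC: {probe_auc:.3f}")  # well above 0.5 because of the leakage
```

A probe AUC far above 0.5 is the signal the authors report: the static attribute is recoverable even though it is never an explicit input feature.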
{"title":"Learning and diSentangling patient static information from time-series Electronic hEalth Records (STEER).","authors":"Wei Liao, Joel Voldman","doi":"10.1371/journal.pdig.0000640","DOIUrl":"10.1371/journal.pdig.0000640","url":null,"abstract":"<p><p>Recent work in machine learning for healthcare has raised concerns about patient privacy and algorithmic fairness. Previous work has shown that self-reported race can be predicted from medical data that does not explicitly contain racial information. However, the extent of data identification is unknown, and we lack ways to develop models whose outcomes are minimally affected by such information. Here we systematically investigated the ability of time-series electronic health record data to predict patient static information. We found that not only the raw time-series data, but also learned representations from machine learning models, can be trained to predict a variety of static information with area under the receiver operating characteristic curve as high as 0.851 for biological sex, 0.869 for binarized age and 0.810 for self-reported race. Such high predictive performance can be extended to various comorbidity factors and exists even when the model was trained for different tasks, using different cohorts, using different model architectures and databases. Given the privacy and fairness concerns these findings pose, we develop a variational autoencoder-based approach that learns a structured latent space to disentangle patient-sensitive attributes from time-series data. 
Our work thoroughly investigates the ability of machine learning models to encode patient static information from time-series electronic health records and introduces a general approach to protect patient-sensitive information for downstream tasks.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"3 10","pages":"e0000640"},"PeriodicalIF":0.0,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11493250/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-18eCollection Date: 2024-10-01DOI: 10.1371/journal.pdig.0000441
Wenshan Li, Luke Turcotte, Amy T Hsu, Robert Talarico, Danial Qureshi, Colleen Webber, Steven Hawken, Peter Tanuseputro, Douglas G Manuel, Greg Huyer
Objectives: To develop and validate a model to predict time to long-term care (LTC) admission among individuals with dementia.
Design: Population-based retrospective cohort study using health administrative data.
Setting and participants: Community-dwelling older adults (65+) in Ontario living with dementia and assessed with the Resident Assessment Instrument for Home Care (RAI-HC) between April 1, 2010 and March 31, 2017.
Methods: Individuals in the derivation cohort (n = 95,813; assessed before March 31, 2015) were followed for up to 360 days after the index RAI-HC assessment for admission into LTC. We used a multivariable Fine-Gray sub-distribution hazard model to predict the cumulative incidence of LTC entry while accounting for all-cause mortality as a competing risk. The model was validated in 34,038 older adults with dementia with an index RAI-HC assessment between April 1, 2015 and March 31, 2017.
Results: Within one year of a RAI-HC assessment, 35,513 (37.1%) individuals in the derivation cohort and 10,735 (31.5%) in the validation cohort entered LTC. Our algorithm was well-calibrated (Emax = 0.119, ICIavg = 0.057) and achieved a c-statistic of 0.707 (95% confidence interval: 0.703-0.712) in the validation cohort.
Conclusions and implications: We developed an algorithm to predict time to LTC entry among individuals living with dementia. This tool can inform care planning for individuals with dementia and their family caregivers.
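The c-statistic reported above is a pairwise concordance measure. A plain implementation over uncensored event times, ignoring the censoring and competing risks that the study's Fine-Gray analysis properly accounts for, looks like this:

```python
def c_statistic(event_times, risk_scores):
    """Fraction of comparable pairs where the earlier event has the higher
    risk score; ties in risk count as half-concordant."""
    concordant, comparable = 0.0, 0
    n = len(event_times)
    for i in range(n):
        for j in range(i + 1, n):
            if event_times[i] == event_times[j]:
                continue                  # tied times are not comparable here
            comparable += 1
            earlier, later = (i, j) if event_times[i] < event_times[j] else (j, i)
            if risk_scores[earlier] > risk_scores[later]:
                concordant += 1.0
            elif risk_scores[earlier] == risk_scores[later]:
                concordant += 0.5
    return concordant / comparable

print(c_statistic([1, 2, 3, 4], [0.9, 0.7, 0.8, 0.1]))   # 5 of 6 pairs concordant
```

A value of 0.707, as in the validation cohort, means the model correctly ranks about 71% of comparable patient pairs by their time to LTC entry.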
{"title":"Derivation and validation of an algorithm to predict transitions from community to residential long-term care among persons with dementia-A retrospective cohort study.","authors":"Wenshan Li, Luke Turcotte, Amy T Hsu, Robert Talarico, Danial Qureshi, Colleen Webber, Steven Hawken, Peter Tanuseputro, Douglas G Manuel, Greg Huyer","doi":"10.1371/journal.pdig.0000441","DOIUrl":"https://doi.org/10.1371/journal.pdig.0000441","url":null,"abstract":"<p><strong>Objectives: </strong>To develop and validate a model to predict time-to-LTC admissions among individuals with dementia.</p><p><strong>Design: </strong>Population-based retrospective cohort study using health administrative data.</p><p><strong>Setting and participants: </strong>Community-dwelling older adults (65+) in Ontario living with dementia and assessed with the Resident Assessment Instrument for Home Care (RAI-HC) between April 1, 2010 and March 31, 2017.</p><p><strong>Methods: </strong>Individuals in the derivation cohort (n = 95,813; assessed before March 31, 2015) were followed for up to 360 days after the index RAI-HC assessment for admission into LTC. We used a multivariable Fine Gray sub-distribution hazard model to predict the cumulative incidence of LTC entry while accounting for all-cause mortality as a competing risk. The model was validated in 34,038 older adults with dementia with an index RAI-HC assessment between April 1, 2015 and March 31, 2017.</p><p><strong>Results: </strong>Within one year of a RAI-HC assessment, 35,513 (37.1%) individuals in the derivation cohort and 10,735 (31.5%) in the validation cohort entered LTC. Our algorithm was well-calibrated (Emax = 0.119, ICIavg = 0.057) and achieved a c-statistic of 0.707 (95% confidence interval: 0.703-0.712) in the validation cohort.</p><p><strong>Conclusions and implications: </strong>We developed an algorithm to predict time to LTC entry among individuals living with dementia. 
This tool can inform care planning for individuals with dementia and their family caregivers.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"3 10","pages":"e0000441"},"PeriodicalIF":0.0,"publicationDate":"2024-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11488705/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-16eCollection Date: 2024-10-01DOI: 10.1371/journal.pdig.0000641
Davide Ferrari, Pietro Arina, Jonathan Edgeworth, Vasa Curcin, Veronica Guidetti, Federica Mandreoli, Yanzhong Wang
Nosocomial infections and Antimicrobial Resistance (AMR) stand as formidable healthcare challenges on a global scale. To address these issues, various infection control protocols and personalized treatment strategies, guided by laboratory tests, aim to detect bloodstream infections (BSI) and assess the potential for AMR. In this study, we introduce a machine learning (ML) approach based on Multi-Objective Symbolic Regression (MOSR), an evolutionary approach that creates ML models in the form of readable mathematical equations while optimizing multiple objectives, overcoming the limitations of standard single-objective approaches. This method leverages readily available clinical data collected upon admission to intensive care units, with the goal of predicting the presence of BSI and AMR. We further assess its performance by comparing it to established ML algorithms using both naturally imbalanced real-world data and data that has been balanced through oversampling techniques. Our findings reveal that traditional ML models exhibit subpar performance across all training scenarios. In contrast, MOSR, specifically configured to minimize false negatives by also optimizing for the F1-score, outperforms the other ML algorithms and consistently delivers reliable results irrespective of training-set balance, with F1-scores 0.22 and 0.28 higher than any alternative. This research signifies a promising path forward in enhancing Antimicrobial Stewardship (AMS) strategies. Notably, the MOSR approach can be readily implemented on a large scale, offering a new ML tool to find solutions to these critical healthcare issues affected by limited data availability.
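The multi-objective selection at the heart of MOSR can be illustrated with a Pareto-front filter over candidate models scored on two objectives, here a hypothetical F1-score to maximize and an equation-complexity measure to minimize. Actual MOSR evolves symbolic expressions; this sketch only shows the non-dominated-selection step.

```python
def pareto_front(candidates):
    """Keep candidates not dominated on (maximize F1, minimize complexity)."""
    front = []
    for f1, comp in candidates:
        dominated = any(
            f2 >= f1 and c2 <= comp and (f2 > f1 or c2 < comp)
            for f2, c2 in candidates
        )
        if not dominated:
            front.append((f1, comp))
    return front

# Hypothetical (F1-score, equation complexity) pairs for candidate models
models = [(0.80, 12), (0.75, 5), (0.80, 20), (0.60, 3), (0.78, 5)]
print(pareto_front(models))   # the non-dominated trade-offs survive
```

Keeping the whole front, rather than a single best score, is what lets a multi-objective method trade accuracy against equation readability instead of collapsing everything into one number.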
{"title":"Using interpretable machine learning to predict bloodstream infection and antimicrobial resistance in patients admitted to ICU: Early alert predictors based on EHR data to guide antimicrobial stewardship.","authors":"Davide Ferrari, Pietro Arina, Jonathan Edgeworth, Vasa Curcin, Veronica Guidetti, Federica Mandreoli, Yanzhong Wang","doi":"10.1371/journal.pdig.0000641","DOIUrl":"https://doi.org/10.1371/journal.pdig.0000641","url":null,"abstract":"<p><p>Nosocomial infections and Antimicrobial Resistance (AMR) stand as formidable healthcare challenges on a global scale. To address these issues, various infection control protocols and personalized treatment strategies, guided by laboratory tests, aim to detect bloodstream infections (BSI) and assess the potential for AMR. In this study, we introduce a machine learning (ML) approach based on Multi-Objective Symbolic Regression (MOSR), an evolutionary approach to create ML models in the form of readable mathematical equations in a multi-objective way to overcome the limitation of standard single-objective approaches. This method leverages readily available clinical data collected upon admission to intensive care units, with the goal of predicting the presence of BSI and AMR. We further assess its performance by comparing it to established ML algorithms using both naturally imbalanced real-world data and data that has been balanced through oversampling techniques. Our findings reveal that traditional ML models exhibit subpar performance across all training scenarios. In contrast, MOSR, specifically configured to minimize false negatives by optimizing also for the F1-Score, outperforms other ML algorithms and consistently delivers reliable results, irrespective of the training set balance with F1-Score.22 and.28 higher than any other alternative. This research signifies a promising path forward in enhancing Antimicrobial Stewardship (AMS) strategies. 
Notably, the MOSR approach can be readily implemented on a large scale, offering a new ML tool to find solutions to these critical healthcare issues affected by limited data availability.</p>","PeriodicalId":74465,"journal":{"name":"PLOS digital health","volume":"3 10","pages":"e0000641"},"PeriodicalIF":0.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11482717/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}