Pub Date: 2024-07-29 DOI: 10.1186/s12911-024-02619-8
Derya Dursun, Rumeysa Bilici Geçer
Background: To evaluate the accuracy, reliability, quality, and readability of responses generated by ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot in relation to orthodontic clear aligners.
Methods: Questions frequently asked by patients and laypersons about clear aligners were identified on websites using Google search and then posed to the ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot AI models. Responses were assessed using a five-point Likert scale for accuracy, the modified DISCERN scale for reliability, the Global Quality Scale (GQS) for quality, and the Flesch Reading Ease Score (FRES) for readability.
Results: ChatGPT-4 responses had the highest mean Likert score (4.5 ± 0.61), followed by Copilot (4.35 ± 0.81), ChatGPT-3.5 (4.15 ± 0.75), and Gemini (4.1 ± 0.72). The differences in Likert scores among the chatbot models were not statistically significant (p > 0.05). Copilot had significantly higher modified DISCERN and GQS scores than Gemini, ChatGPT-4, and ChatGPT-3.5 (p < 0.05). Gemini's modified DISCERN and GQS scores were significantly higher than those of ChatGPT-3.5 (p < 0.05). Gemini also had a significantly higher FRES than ChatGPT-4, Copilot, and ChatGPT-3.5 (p < 0.05). The mean FRES was 38.39 ± 11.56 for ChatGPT-3.5, 43.88 ± 10.13 for ChatGPT-4, and 41.72 ± 10.74 for Copilot, indicating that these responses were difficult to read. The mean FRES for Gemini was 54.12 ± 10.27, indicating that its responses were more readable than those of the other chatbots.
Conclusions: All chatbot models provided generally accurate, moderately reliable, and moderate-to-good quality answers to questions about clear aligners. Their readability, however, was poor. ChatGPT, Gemini, and Copilot have significant potential as patient information tools in orthodontics, but to be fully effective they need to be supplemented with more evidence-based information and made more readable.
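The readability figures above are Flesch Reading Ease Scores. As a rough illustration of how such a score is obtained, the sketch below applies the standard Flesch formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word). The syllable counter is a crude vowel-group heuristic chosen here for brevity, and the function names are illustrative; the abstract does not say which tool the authors used, so scores from this sketch will not necessarily match theirs.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (a heuristic assumption, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    sample = "Clear aligners are removable. They move teeth gradually."
    print(round(flesch_reading_ease(sample), 2))
```

On the conventional Flesch interpretation, scores between 30 and 50 are labelled difficult and scores between 50 and 60 fairly difficult, which matches the abstract's reading of the three lower-scoring chatbots.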
{"title":"Can artificial intelligence models serve as patient information consultants in orthodontics?","authors":"Derya Dursun, Rumeysa Bilici Geçer","doi":"10.1186/s12911-024-02619-8","DOIUrl":"10.1186/s12911-024-02619-8","url":null,"abstract":"<p><strong>Background: </strong>To evaluate the accuracy, reliability, quality, and readability of responses generated by ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot in relation to orthodontic clear aligners.</p><p><strong>Methods: </strong>Frequently asked questions by patients/laypersons about clear aligners on websites were identified using the Google search tool and these questions were posed to ChatGPT-3.5, ChatGPT-4, Gemini, and Copilot AI models. Responses were assessed using a five-point Likert scale for accuracy, the modified DISCERN scale for reliability, the Global Quality Scale (GQS) for quality, and the Flesch Reading Ease Score (FRES) for readability.</p><p><strong>Results: </strong>ChatGPT-4 responses had the highest mean Likert score (4.5 ± 0.61), followed by Copilot (4.35 ± 0.81), ChatGPT-3.5 (4.15 ± 0.75) and Gemini (4.1 ± 0.72). The difference between the Likert scores of the chatbot models was not statistically significant (p > 0.05). Copilot had a significantly higher modified DISCERN and GQS score compared to both Gemini, ChatGPT-4 and ChatGPT-3.5 (p < 0.05). Gemini's modified DISCERN and GQS score was statistically higher than ChatGPT-3.5 (p < 0.05). Gemini also had a significantly higher FRES compared to both ChatGPT-4, Copilot and ChatGPT-3.5 (p < 0.05). The mean FRES was 38.39 ± 11.56 for ChatGPT-3.5, 43.88 ± 10.13 for ChatGPT-4 and 41.72 ± 10.74 for Copilot, indicating that the responses were difficult to read according to the reading level. 
The mean FRES for Gemini is 54.12 ± 10.27, indicating that Gemini's responses are more readable than other chatbots.</p><p><strong>Conclusions: </strong>All chatbot models provided generally accurate, moderate reliable and moderate to good quality answers to questions about the clear aligners. Furthermore, the readability of the responses was difficult. ChatGPT, Gemini and Copilot have significant potential as patient information tools in orthodontics, however, to be fully effective they need to be supplemented with more evidence-based information and improved readability.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11285120/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141792010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-07-29 DOI: 10.1186/s12911-024-02618-9
Mike Nsubuga, Henry Mutegeki, Daudi Jjingo, Deogratias Munube, Ruth Namazzi, Robert Opoka, Philip Kasirye, Grace Ndeezi, Heather Hume, Ezekiel Mupere, Grace Kebirungi, Isaac Birungi, Jack Morrice, Mario Jonas, Victoria Nembaware, Ambroise Wonkam, Julie Makani, Sarah Kiguli
Background: Sub-Saharan Africa bears the highest burden of sickle cell disease (SCD) globally, with Nigeria, the Democratic Republic of Congo, Tanzania, and Uganda being the most affected countries. Uganda reports approximately 20,000 SCD births annually, constituting 6.67% of reported global SCD births. Despite this, there is a paucity of comprehensive data on SCD from the African continent. SCD registries offer a promising avenue for conducting prospective studies, elucidating disease severity patterns, and evaluating the interplay of social, environmental, and genetic factors. This paper describes the establishment of the Sickle Pan Africa Research Consortium (SPARCo) Uganda registry, covering its design, development, data collection, and lessons learned, in alignment with the collaborative SPARCo registries in Nigeria, Tanzania, and Ghana.
Methods: The registry was created using pre-existing case report forms harmonized from the SPARCo data dictionary and ontology to fit Uganda's clinical needs. The case report forms were developed with SCD data elements of interest, including demographic, consent, baseline, clinical, and laboratory data, among others. The data were then parsed into a customized REDCap database configured to suit the optimized ontologies and to support retrieval, aggregation, and analysis. Patients were enrolled from one national referral hospital and three regional referral hospitals in Uganda.
Results: A nationwide electronic patient-consented registry for SCD was established across four referral hospitals (one national and three regional). A total of 5,655 patients were enrolled from Mulago National Referral Hospital (58%), Jinja Regional Referral (14.4%), Mbale Regional Referral (16.9%), and Lira Regional Referral (10.7%) hospitals between June 2022 and October 2023.
Conclusion: Uganda has been able to develop an SCD registry consistent with data from Tanzania, Nigeria, and Ghana. Our findings demonstrate that it is feasible to develop longitudinal SCD registries in sub-Saharan Africa. These registries will be crucial for facilitating a range of studies, including the analysis of SCD clinical phenotypes and patient outcomes, newborn screening, and evaluation of hydroxyurea use, among others. This initiative underscores the potential for developing comprehensive disease registries in resource-limited settings, fostering collaborative, data-driven research efforts aimed at addressing the multifaceted challenges of SCD in Africa.
{"title":"The Ugandan sickle Pan-African research consortium registry: design, development, and lessons.","authors":"Mike Nsubuga, Henry Mutegeki, Daudi Jjingo, Deogratias Munube, Ruth Namazzi, Robert Opoka, Philip Kasirye, Grace Ndeezi, Heather Hume, Ezekiel Mupere, Grace Kebirungi, Isaac Birungi, Jack Morrice, Mario Jonas, Victoria Nembaware, Ambroise Wonkam, Julie Makani, Sarah Kiguli","doi":"10.1186/s12911-024-02618-9","DOIUrl":"10.1186/s12911-024-02618-9","url":null,"abstract":"<p><strong>Background: </strong>Sub-Saharan Africa bears the highest burden of sickle cell disease (SCD) globally with Nigeria, Democratic Republic of Congo, Tanzania, Uganda being the most affected countries. Uganda reports approximately 20,000 SCD births annually, constituting 6.67% of reported global SCD births. Despite this, there is a paucity of comprehensive data on SCD from the African continent. SCD registries offer a promising avenue for conducting prospective studies, elucidating disease severity patterns, and evaluating the intricate interplay of social, environmental, and genetic factors. This paper describes the establishment of the Sickle Pan Africa Research Consortium (SPARCo) Uganda registry, encompassing its design, development, data collection, and key insights learned, aligning with collaborative efforts in Nigeria, Tanzania, and Ghana SPARCo registries.</p><p><strong>Methods: </strong>The registry was created using pre-existing case report forms harmonized from the SPARCo data dictionary and ontology to fit Uganda clinical needs. The case report forms were developed with SCD data elements of interest including demographics, consent, baseline, clinical, laboratory and others. That data was then parsed into a customized REDCap database, configured to suit the optimized ontologies and support retrieval aggregations and analyses. 
Patients were enrolled from one national referral and three regional referral hospitals in Uganda.</p><p><strong>Results: </strong>A nationwide electronic patient-consented registry for SCD was established from four regional hospitals. A total of 5,655 patients were enrolled from Mulago National Referral Hospital (58%), Jinja Regional Referral (14.4%), Mbale Regional Referral (16.9%), and Lira Regional Referral (10.7%) hospitals between June 2022 and October 2023.</p><p><strong>Conclusion: </strong>Uganda has been able to develop a SCD registry consistent with data from Tanzania, Nigeria and Ghana. Our findings demonstrate that it's feasible to develop longitudinal SCD registries in sub-Saharan Africa. These registries will be crucial for facilitating a range of studies, including the analysis of SCD clinical phenotypes and patient outcomes, newborn screening, and evaluation of hydroxyurea use, among others. This initiative underscores the potential for developing comprehensive disease registries in resource-limited settings, fostering collaborative, data-driven research efforts aimed at addressing the multifaceted challenges of SCD in Africa.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11285451/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141792014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background: This study uses artificial intelligence to forecast trends in the procurement and storage of various blood products through 2027, and to support planning and monitoring of blood product consumption in centers across Iran.
Methods: This is a longitudinal time-series study. Information on the number of units of packed red blood cells (RBC), leukoreduced red blood cells (LR-RBC), platelets (PLT), PLT-Apheresis, and fresh frozen plasma (FFP) was requested from all blood transfusion centers in the country and extracted using a unified protocol. After an initial examination of the information and the resolution of data issues and inconsistencies, the corrected data were analyzed. Both conventional and artificial intelligence approaches were used to forecast each product. The best model was selected based on two goodness-of-fit indicators, RMSE and MAPE.
Results: Over the next five years, FFP is predicted to follow a relatively stable trend similar to that of previous years. PLT is predicted to grow over the next five years in both demand and supply. PLT-Apheresis shows a similar upward trend, albeit with a lower growth rate. According to both models, RBC will have a constant long-term trend over the five-year period once short-term changes are taken into account. LR-RBC shows a similar trend, with short-term patterns expected to repeat over the five-year period. Comparing the goodness-of-fit results, the LSTM model proved better for predicting the dominant blood products.
Conclusions: The growth of the elderly population and of age-related diseases, together with rising consumption of a product with a short shelf life (PLT), calls for active patient blood management in medical centers, particularly for this product. The trend for the other products over the next five years is similar to that of previous years, and no growth in demand is observed. The LSTM method performed the prediction while accounting for periodic and cyclical events.
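Model selection in this study rests on RMSE and MAPE. The sketch below shows the usual definitions of both metrics in plain Python. The function names and the sample numbers are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: square root of the mean squared forecast error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; undefined when an actual value is zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical monthly unit counts, for illustration only.
    observed = [100.0, 200.0, 400.0]
    forecast = [110.0, 180.0, 400.0]
    print(rmse(observed, forecast), mape(observed, forecast))
```

RMSE is expressed in the units of the series and penalizes large misses more heavily, while MAPE is scale-free, which makes it easier to compare across products with very different volumes.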
{"title":"Long-term prediction of Iranian blood product supply using LSTM: a 5-year forecast.","authors":"Ebrahim Miri-Moghaddam, Saeede Khosravi Bizhaem, Zohre Moezzifar, Fatemeh Salmani","doi":"10.1186/s12911-024-02614-z","DOIUrl":"10.1186/s12911-024-02614-z","url":null,"abstract":"<p><strong>Background: </strong>This study aims to predict the trend of procurement and storage of various blood products, as well as planning and monitoring the consumption of blood products in different centers across Iran based on artificial intelligence until the year 2027.</p><p><strong>Methods: </strong>This research constitutes a time-series investigation within the realm of longitudinal studies. In this study, information on the number of packed red blood cells (RBC), leukoreduced red blood cells (LR-RBC), and platelets (PLT), PLT-Apheresis, and fresh frozen plasma (FFP) was requested from all blood transfusion centers in the country and extracted using a unified protocol. After the initial examination of the information and addressing data issues and inconsistencies, the corrected data were analyzed. Both conventional and artificial intelligence approaches were used to predict each product in this study. The best model was selected based on goodness-of-fit indicators RMSE and MAPE.</p><p><strong>Results: </strong>Based on the obtained results, the FFP product will follow a relatively consistent process similar to previous years in the next five years. The PLT product is predicted to have a growing trend over the next 5 years, which applies to both the demand and supply of the product. The PLT-Apheresis product also shows a similar upward trend, albeit with a lower growth rate. The RBC product will have a constant trend over a 5-year period (long-term) according to both models, taking into account short-term changes. Similarly, there is a similar trend in LR-RBC, with the expectation that short-term pattern repetition will continue over a 5-year period (long-term). 
Comparing the goodness-of-fit results, the LSTM model proved to be better for predicting the dominant blood products.</p><p><strong>Conclusions: </strong>The growth of the elderly population and diseases related to old age, and on the other hand, the trend of increasing the consumption of the product with a short lifespan (PLT) requires the activation of the management of the patient's blood, especially in relation to this product in medical centers. The trend for other products in the next five years is similar to previous years, and no growth in demand is observed. The LSTM method, considering periodic and cyclical events, has performed the prediction.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11288095/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141792013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-07-29 DOI: 10.1186/s12911-024-02589-x
Nishchay Mehta, Baptiste Briot Ribeyre, Lilia Dimitrov, Louise J English, Colleen Ewart, Antje Heinrich, Nikhil Joshi, Kevin J Munro, Gail Roadknight, Luis Romao, Anne Gm Schilder, Ruth V Spriggs, Ruth Norris, Talisa Ross, George Tilston
Background: The National Institute of Health and Social Care Research (NIHR) Health Informatics Collaborative (HIC) for Hearing Health has been established in the UK to curate routinely collected hearing health data to address research questions. This study defines priority research areas, outlines the collaborative's aims and governance structure, and demonstrates how hearing health data have been integrated into a common data model, using pure tone audiometry (PTA) as a case study.
Methods: Key research aims in hearing health were identified, and the governance structure for the NIHR HIC for Hearing Health is described. The Observational Medical Outcomes Partnership (OMOP) model was chosen as the common data model for the case study.
Results: The NIHR HIC Hearing Health theme has developed a data architecture outlining the flow of data from the various siloed electronic patient record systems, allowing data from those systems to be linked effectively to research systems. Using PTAs as an example, OMOPification of hearing health data successfully collated a rich breadth of data points across multiple centres.
Conclusion: This study identified priority research areas where routinely collected hearing health data could be useful. It demonstrates integration and standardisation of such data into a common data model from multiple centres. By describing the process of data sharing across the HIC, we hope to invite more centres to contribute and utilise data to address research questions in hearing health. This national initiative has the power to transform UK hearing research and hearing care using routinely collected clinical data.
{"title":"Creating a health informatics data resource for hearing health research.","authors":"Nishchay Mehta, Baptiste Briot Ribeyre, Lilia Dimitrov, Louise J English, Colleen Ewart, Antje Heinrich, Nikhil Joshi, Kevin J Munro, Gail Roadknight, Luis Romao, Anne Gm Schilder, Ruth V Spriggs, Ruth Norris, Talisa Ross, George Tilston","doi":"10.1186/s12911-024-02589-x","DOIUrl":"10.1186/s12911-024-02589-x","url":null,"abstract":"<p><strong>Background: </strong>The National Institute of Health and Social Care Research (NIHR) Health Informatics Collaborative (HIC) for Hearing Health has been established in the UK to curate routinely collected hearing health data to address research questions. This study defines priority research areas, outlines its aims, governance structure and demonstrates how hearing health data have been integrated into a common data model using pure tone audiometry (PTA) as a case study.</p><p><strong>Methods: </strong>After identifying key research aims in hearing health, the governance structure for the NIHR HIC for Hearing Health is described. The Observational Medical Outcomes Partnership (OMOP) was chosen as our common data model to provide a case study example.</p><p><strong>Results: </strong>The NIHR HIC Hearing Health theme have developed a data architecture outlying the flow of data from all of the various siloed electronic patient record systems to allow the effective linkage of data from electronic patient record systems to research systems. Using PTAs as an example, OMOPification of hearing health data successfully collated a rich breadth of datapoints across multiple centres.</p><p><strong>Conclusion: </strong>This study identified priority research areas where routinely collected hearing health data could be useful. It demonstrates integration and standardisation of such data into a common data model from multiple centres. 
By describing the process of data sharing across the HIC, we hope to invite more centres to contribute and utilise data to address research questions in hearing health. This national initiative has the power to transform UK hearing research and hearing care using routinely collected clinical data.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11285202/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141792011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-07-29 DOI: 10.1186/s12911-024-02610-3
Michelle Pistner Nixon, Farhani Momotaz, Claire Smith, Jeffrey S Smith, Mark Sendak, Christopher Polage, Justin D Silverman
Background: A central goal of modern evidence-based medicine is the development of simple and easy to use tools that help clinicians integrate quantitative information into medical decision-making. The Bayesian Pre-test/Post-test Probability (BPP) framework is arguably the most well known of such tools and provides a formal approach to quantify diagnostic uncertainty given the result of a medical test or the presence of a clinical sign. Yet, clinical decision-making goes beyond quantifying diagnostic uncertainty and requires that that uncertainty be balanced against the various costs and benefits associated with each possible decision. Despite increasing attention in recent years, simple and flexible approaches to quantitative clinical decision-making have remained elusive.
Methods: We extend the BPP framework using concepts of Bayesian Decision Theory. By integrating cost, we can expand the BPP framework to allow for clinical decision-making.
Results: We develop a simple quantitative framework for binary clinical decisions (e.g., action/inaction, treat/no-treat, test/no-test). Let $p$ be the pre-test or post-test probability that a patient has disease. We show that $r^{*} = (1 - p)/p$ represents a critical value called a decision boundary. In terms of the relative cost of under- to over-acting, $r^{*}$ represents the critical value at which action and inaction are equally optimal. We demonstrate how this decision boundary can be used at the bedside through case studies and as a research tool through a reanalysis of a recent study which found widespread misestimation of pre-test and post-test probabilities among clinicians.
Conclusions: Our approach is so simple that it should be thought of as a core, yet previously overlooked, part of the BPP framework. Unlike prior approaches to quantitative clinical decision-making, our approach requires little more than a hand-held calculator, is applicable in almost any setting where the BPP framework can be used, and excels in situations where the costs and benefits associated with a particular decision are patient-specific and difficult to quantify.
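The decision boundary reported in this abstract is r* = (1 - p)/p, where p is the probability of disease. The sketch below shows one way the rule might be applied. It assumes that the relative cost r is the cost of under-acting divided by the cost of over-acting, and that acting is preferred when r exceeds r*. That reading follows from comparing the expected cost of inaction (p times the cost of under-acting) with the expected cost of action ((1 - p) times the cost of over-acting), which is an interpretation of the abstract rather than the authors' stated derivation. Function and parameter names are illustrative.

```python
def decision_boundary(p: float) -> float:
    """Return r* = (1 - p) / p for a disease probability p in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return (1.0 - p) / p


def should_act(p: float, cost_under: float, cost_over: float) -> bool:
    """Act when the relative cost of under- to over-acting exceeds the boundary.

    cost_under: cost of failing to act when disease is present (assumed positive).
    cost_over: cost of acting when disease is absent (assumed positive).
    """
    if cost_under <= 0 or cost_over <= 0:
        raise ValueError("costs must be positive")
    relative_cost = cost_under / cost_over
    return relative_cost > decision_boundary(p)


if __name__ == "__main__":
    # With p = 0.2 the boundary is 4: acting is preferred only if missing the
    # disease is judged more than four times as costly as acting unnecessarily.
    print(decision_boundary(0.2))
    print(should_act(0.2, cost_under=10.0, cost_over=1.0))
    print(should_act(0.2, cost_under=2.0, cost_over=1.0))
```

Because only the ratio of the two costs enters the rule, a clinician does not need absolute cost values, only a judgment of how much worse one error is than the other.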
{"title":"From pre-test and post-test probabilities to medical decision making.","authors":"Michelle Pistner Nixon, Farhani Momotaz, Claire Smith, Jeffrey S Smith, Mark Sendak, Christopher Polage, Justin D Silverman","doi":"10.1186/s12911-024-02610-3","DOIUrl":"10.1186/s12911-024-02610-3","url":null,"abstract":"<p><strong>Background: </strong>A central goal of modern evidence-based medicine is the development of simple and easy to use tools that help clinicians integrate quantitative information into medical decision-making. The Bayesian Pre-test/Post-test Probability (BPP) framework is arguably the most well known of such tools and provides a formal approach to quantify diagnostic uncertainty given the result of a medical test or the presence of a clinical sign. Yet, clinical decision-making goes beyond quantifying diagnostic uncertainty and requires that that uncertainty be balanced against the various costs and benefits associated with each possible decision. Despite increasing attention in recent years, simple and flexible approaches to quantitative clinical decision-making have remained elusive.</p><p><strong>Methods: </strong>We extend the BPP framework using concepts of Bayesian Decision Theory. By integrating cost, we can expand the BPP framework to allow for clinical decision-making.</p><p><strong>Results: </strong>We develop a simple quantitative framework for binary clinical decisions (e.g., action/inaction, treat/no-treat, test/no-test). Let p be the pre-test or post-test probability that a patient has disease. We show that <math> <mrow><mmultiscripts><mi>r</mi> <mrow></mrow> <mrow><mrow></mrow> <mo>∗</mo></mrow> </mmultiscripts> <mo>=</mo> <mrow><mo>(</mo> <mn>1</mn> <mo>-</mo> <mi>p</mi> <mo>)</mo></mrow> <mo>/</mo> <mi>p</mi></mrow> </math> represents a critical value called a decision boundary. 
In terms of the relative cost of under- to over-acting, <math><mmultiscripts><mi>r</mi> <mrow></mrow> <mrow><mrow></mrow> <mo>∗</mo></mrow> </mmultiscripts> </math> represents the critical value at which action and inaction are equally optimal. We demonstrate how this decision boundary can be used at the bedside through case studies and as a research tool through a reanalysis of a recent study which found widespread misestimation of pre-test and post-test probabilities among clinicians.</p><p><strong>Conclusions: </strong>Our approach is so simple that it should be thought of as a core, yet previously overlooked, part of the BPP framework. Unlike prior approaches to quantitative clinical decision-making, our approach requires little more than a hand-held calculator, is applicable in almost any setting where the BPP framework can be used, and excels in situations where the costs and benefits associated with a particular decision are patient-specific and difficult to quantify.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11285418/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141792012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-07-29 DOI: 10.1186/s12911-024-02600-5
Sumit Madan, Manuel Lentzen, Johannes Brandt, Daniel Rueckert, Martin Hofmann-Apitius, Holger Fröhlich
Deep neural networks (DNNs) have fundamentally changed the field of artificial intelligence (AI). The transformer model is a type of DNN that was originally used for natural language processing tasks and has since gained increasing attention for processing various kinds of sequential data, including biological sequences and structured electronic health records. Along with this development, transformer-based models such as BioBERT, MedBERT, and MassGenie have been trained and deployed by researchers to answer various scientific questions originating in the biomedical domain. In this paper, we review the development and application of transformer models for analyzing biomedical datasets such as biomedical textual data, protein sequences, structured longitudinal medical data, biomedical images, and graphs. We also examine explainable AI strategies that help to interpret the predictions of transformer-based models. Finally, we discuss the limitations and challenges of current models and point out emerging research directions.
{"title":"Transformer models in biomedicine.","authors":"Sumit Madan, Manuel Lentzen, Johannes Brandt, Daniel Rueckert, Martin Hofmann-Apitius, Holger Fröhlich","doi":"10.1186/s12911-024-02600-5","DOIUrl":"10.1186/s12911-024-02600-5","url":null,"abstract":"<p><p>Deep neural networks (DNN) have fundamentally revolutionized the artificial intelligence (AI) field. The transformer model is a type of DNN that was originally used for the natural language processing tasks and has since gained more and more attention for processing various kinds of sequential data, including biological sequences and structured electronic health records. Along with this development, transformer-based models such as BioBERT, MedBERT, and MassGenie have been trained and deployed by researchers to answer various scientific questions originating in the biomedical domain. In this paper, we review the development and application of transformer models for analyzing various biomedical-related datasets such as biomedical textual data, protein sequences, medical structured-longitudinal data, and biomedical images as well as graphs. Also, we look at explainable AI strategies that help to comprehend the predictions of transformer-based models. 
Finally, we discuss the limitations and challenges of current models, and point out emerging novel research directions.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":null,"pages":null},"PeriodicalIF":3.3,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11287876/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141792015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-07-25 DOI: 10.1186/s12911-024-02583-3
Le Li, Jingyuan Guan, Xi Peng, Likun Zhou, Zhuxin Zhang, Ligang Ding, Lihui Zheng, Lingmin Wu, Zhicheng Hu, Limin Liu, Yan Yao
Introduction: Sepsis-associated acute kidney injury (SA-AKI) is strongly associated with poor prognosis. We aimed to build a machine learning (ML)-based clinical model to predict 1-year mortality in patients with SA-AKI.
Methods: Six ML algorithms were used for model fitting. Features were selected by importance, as measured by SHapley Additive exPlanations (SHAP) values. The area under the receiver operating characteristic curve (AUROC) was used to evaluate the discriminatory ability of the prediction model. Calibration curves and the Brier score were used to assess calibration. The ML-based prediction models were validated both internally and externally.
Results: A total of 12,750 patients with SA-AKI and 55 features were included to build the prediction models. Based on feature importance, we identified the top 10 predictors, including age, ICU stay, and GCS score. Among the six ML algorithms, CatBoost showed the best prediction performance, with an AUROC of 0.813 and a Brier score of 0.119. In the external validation set, the predictive value remained favorable (AUROC = 0.784).
Conclusion: We developed and validated an ML-based prediction model built on 10 commonly used clinical features that could accurately identify, at an early stage, patients with SA-AKI at high risk of long-term mortality.
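The two headline metrics in this abstract, AUROC for discrimination and the Brier score for calibration, can be illustrated with a minimal sketch. The labels and probabilities below are made-up toy values, not study data, and the function names are hypothetical; the abstract does not say which software the authors used.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random positive case is scored above a random negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = died within one year, 0 = survived.
y_true = [1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.1]

toy_brier = brier_score(y_true, y_prob)
toy_auroc = auroc(y_true, y_prob)
print(f"Brier score: {toy_brier:.3f}, AUROC: {toy_auroc:.3f}")
```

A lower Brier score and a higher AUROC are both better; a model can rank patients well (high AUROC) while still being poorly calibrated, which is why the two are reported together.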
Title: Machine learning for the prediction of 1-year mortality in patients with sepsis-associated acute kidney injury.
Journal: BMC Medical Informatics and Decision Making (impact factor 3.3). Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11271185/pdf/
Pub Date: 2024-07-25; DOI: 10.1186/s12911-024-02617-w
Bin Wang, Xia Huang, Guofang Liu, Taohua Zheng, Hui Lin, Yue Qiao, Wenjuan Sun
Objective: To establish, based on the Omaha problem classification system, a sensitivity outcome index system for the home nursing of elderly liver transplant patients.
Methods: Through a comprehensive literature review and rigorous application of the Delphi method, a panel of 20 experts completed two rounds of correspondence consultation to reach expert consensus. The content of the indicators was determined from this process, and the analytic hierarchy process was used to set the weight assigned to each indicator. In this way, we established a sensitivity outcome index system for home care of elderly liver transplant patients.
Results: The effective questionnaire response rate in both rounds of expert consultation was 100%, and the proportion of experts who gave opinions was 55% and 15%, respectively, indicating that the experts were highly engaged. The expert authority coefficients were 0.904 and 0.905, respectively, indicating a high degree of expert authority. In the second round, Kendall's coefficients of concordance for the primary, secondary, and tertiary indicators were 0.419, 0.418, and 0.394 (P < 0.001), indicating that expert opinions tended to be consistent. Finally, we established a comprehensive sensitivity outcome index system comprising 4 first-level indexes, 20 second-level indexes, and 72 third-level indexes specifically designed for elderly liver transplant patients.
Conclusion: The sensitivity outcome index system for home nursing of elderly liver transplant patients can provide a theoretical basis for nursing staff to build an accurate, individualized continuous nursing model.
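The agreement statistic reported in the Results, Kendall's coefficient of concordance (W), can be sketched as follows. This is a minimal illustration that assumes each expert supplies a complete ranking without ties; Delphi ratings on a Likert scale usually contain ties and would need the tie-corrected formula, and the abstract does not say how the authors computed W. The rankings are invented.

```python
def kendalls_w(rank_matrix):
    """Kendall's W for m raters ranking n items, assuming no tied ranks.

    rank_matrix[i][j] is the rank (1..n) that rater i gives to item j.
    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared
    deviations of the per-item rank sums from their mean.
    """
    m = len(rank_matrix)
    n = len(rank_matrix[0])
    rank_sums = [sum(row[j] for row in rank_matrix) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings: 3 experts each rank 4 indicators by importance.
expert_ranks = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
w = kendalls_w(expert_ranks)
print(f"Kendall's W = {w:.3f}")
```

W ranges from 0 (no agreement) to 1 (identical rankings), so values around 0.4 with a small P value, as reported here, indicate agreement that is statistically significant but far from complete.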
Title: The sensitivity outcome index system for home care of elderly liver transplant patients was developed based on the Omaha problem classification system.
Journal: BMC Medical Informatics and Decision Making (impact factor 3.3). Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11270964/pdf/
Pub Date: 2024-07-24; DOI: 10.1186/s12911-024-02612-1
Minghui Sun, Matthew M Engelhard, Armando D Bedoya, Benjamin A Goldstein
Background: Electronic Health Records (EHR) are widely used to develop clinical prediction models (CPMs). However, one challenge is that the data are often informatively missing. For example, laboratory measures are typically taken when a clinician is concerned that there is a need. When data are so-called Not Missing at Random (NMAR), analytic strategies based on other missingness mechanisms are inappropriate. In this work, we compare the impact of different strategies for handling missing data on CPM performance.
Methods: We considered a predictive model for rapid inpatient deterioration as an exemplar implementation. This model incorporated twelve laboratory measures with varying levels of missingness. Five labs had missingness rates of around 50%, and the other seven had rates of around 90%. We included them based on the belief that their missingness status can be highly informative for the prediction. We explicitly compared several missing data strategies: mean imputation, normal-value imputation, conditional imputation, categorical encoding, and missingness embeddings. Some of these were also combined with last observation carried forward (LOCF). We implemented logistic LASSO regression, multilayer perceptron (MLP), and long short-term memory (LSTM) models as the downstream classifiers. We compared AUROC on the test data and used bootstrapping to construct 95% confidence intervals.
Results: We had 105,198 inpatient encounters, of which 4.7% experienced the deterioration outcome of interest. LSTM models generally outperformed the cross-sectional models; among the LSTM models, embedding approaches and categorical encoding yielded the best results. For the cross-sectional models, normal-value imputation with LOCF generated the best results.
Conclusion: Strategies that accounted for the possibility of NMAR missing data yielded better model performance than those that did not. The embedding method had an advantage in that it did not require prior clinical knowledge. Using LOCF could enhance the performance of cross-sectional models but had counterproductive effects in LSTM models.
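Three of the simpler strategies named in the Methods (mean imputation, normal-value imputation, and LOCF) can be sketched on a single lab series, together with an explicit missingness flag that lets a downstream model see whether a value was measured at all. This is an illustrative sketch only: the lactate values, the "normal" value of 1.0, and all function names are assumptions, and the paper's own definitions of each strategy may differ.

```python
from statistics import mean


def locf(series):
    """Last observation carried forward; gaps before the first value stay None."""
    carried, last = [], None
    for value in series:
        if value is not None:
            last = value
        carried.append(last)
    return carried


def impute_with_indicator(series, fill_value):
    """Fill gaps with fill_value and return (filled values, 0/1 missingness flags)."""
    flags = [1 if value is None else 0 for value in series]
    filled = [fill_value if value is None else value for value in series]
    return filled, flags


# Hypothetical lactate measurements over five time steps (None = not ordered).
lactate = [None, 2.1, None, None, 3.4]
observed = [value for value in lactate if value is not None]

# Mean imputation plus a flag recording which entries were imputed.
mean_filled, missing_flags = impute_with_indicator(lactate, mean(observed))

# LOCF first, then an assumed "normal" value for anything still missing.
ASSUMED_NORMAL_LACTATE = 1.0
carried = locf(lactate)
normal_filled, _ = impute_with_indicator(carried, ASSUMED_NORMAL_LACTATE)

print(mean_filled, missing_flags, normal_filled)
```

The flag column is what carries the informative-missingness signal: without it, an imputed value is indistinguishable from a measured one.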
Title: Incorporating informatively collected laboratory data from EHR in clinical prediction models.
Journal: BMC Medical Informatics and Decision Making (impact factor 3.3). Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11270887/pdf/
Pub Date: 2024-07-24; DOI: 10.1186/s12911-024-02609-w
Jocelyn Dunstan, Thomas Vakili, Luis Miranda, Fabián Villena, Claudio Aracena, Tamara Quiroga, Paulina Vera, Sebastián Viteri Valenzuela, Victor Rocco
Despite their high creation cost, annotated corpora are indispensable for robust natural language processing systems. In the clinical field, in addition to annotating medical entities, corpus creators must also remove personally identifiable information (PII). This has become increasingly important in the era of large language models, where unwanted memorization can occur. This paper presents a corpus annotated to anonymize personally identifiable information in 1,787 anamneses of work-related accidents and diseases in Spanish. Additionally, we applied a previously released model for Named Entity Recognition (NER), trained on referrals from primary care physicians, to identify diseases, body parts, and medications in this work-related text. We analyzed in detail the differences between the models and the gold standard curated by a physician. Moreover, we compared the performance of the NER model on the original narratives, on narratives where personal information has been masked, and on texts where the personal data are replaced by similar surrogate values (pseudonymization). Within this publication, we share the annotation guidelines and the annotated corpus.
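The difference between masking and pseudonymization, the two de-identified variants compared in this abstract, can be shown with a small sketch. The sentence, the label names (NAME, LOCATION), and the surrogate values are invented for illustration and are not taken from the corpus or its annotation guidelines.

```python
def span_of(text, phrase, label):
    """Locate phrase in text and return an annotation as (start, end, label)."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def replace_spans(text, spans, replacement_for):
    """Replace each annotated span, working right to left so offsets stay valid."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + replacement_for(label) + text[end:]
    return text


# Hypothetical narrative and PII annotations.
narrative = "Paciente Juan Perez atendido en Santiago."
pii_spans = [
    span_of(narrative, "Juan Perez", "NAME"),
    span_of(narrative, "Santiago", "LOCATION"),
]

# Masking: every PII span becomes a placeholder that shows only its category.
masked = replace_spans(narrative, pii_spans, lambda label: f"[{label}]")

# Pseudonymization: every PII span becomes a realistic surrogate of the same category.
surrogates = {"NAME": "Pedro Soto", "LOCATION": "Valparaiso"}
pseudonymized = replace_spans(narrative, pii_spans, lambda label: surrogates[label])

print(masked)
print(pseudonymized)
```

Masked text no longer looks like natural clinical prose, whereas pseudonymized text keeps its shape, which is one plausible reason a downstream NER model might behave differently on the two variants.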
Title: A pseudonymized corpus of occupational health narratives for clinical entity recognition in Spanish.
Journal: BMC Medical Informatics and Decision Making (impact factor 3.3). Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11267746/pdf/