Pub Date: 2021-07-14. DOI: 10.1186/s41512-021-00102-w
Qian M Zhou, Lu Zhe, Russell J Brooke, Melissa M Hudson, Yan Yuan
Background: Incremental value (IncV) evaluates the performance change between an existing risk model and a new model. Different IncV metrics do not always agree with each other. For example, compared with a prescribed-dose model, an ovarian-dose model for predicting acute ovarian failure has a slightly lower area under the receiver operating characteristic curve (AUC) but increases the area under the precision-recall curve (AP) by 48%. This phenomenon of disagreement is not uncommon, and can create confusion when assessing whether the added information improves the model prediction accuracy.
Methods: In this article, we examine the analytical connections and differences between the AUC IncV (ΔAUC) and AP IncV (ΔAP). We also compare the true values of these two IncV metrics in a numerical study. Additionally, as both are semi-proper scoring rules, we compare them with a strictly proper scoring rule: the IncV of the scaled Brier score (ΔsBrS) in the numerical study.
Results: We demonstrate that ΔAUC and ΔAP are both weighted averages of the changes (from the existing model to the new one) in separating the risk score distributions between events and non-events. However, ΔAP assigns heavier weights to the changes in higher-risk regions, whereas ΔAUC weights the changes equally. Due to this difference, the two IncV metrics can disagree, and the numerical study shows that their disagreement becomes more pronounced as the event rate decreases. In the numerical study, we also find that ΔAP has a wide range, from negative to positive, but the range of ΔAUC is much smaller. In addition, ΔAP and ΔsBrS are highly consistent, but ΔAUC is negatively correlated with ΔsBrS and ΔAP when the event rate is low.
Conclusions: ΔAUC treats the wins and losses of a new risk model equally across different risk regions. When neither the existing nor the new model is the true model, this equality could attenuate a superior performance of the new model in a sub-region. In contrast, ΔAP accentuates the change in prediction accuracy for higher-risk regions.
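The three incremental values compared in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names (`auc`, `average_precision`, `scaled_brier`, `incv`) and the toy data are assumptions introduced here, and the estimators are the usual empirical ones (pairwise-comparison AUC, precision averaged over events for AP, and one minus the Brier score divided by that of a model predicting the event rate for everyone).

```python
def auc(y, p):
    """Share of (event, non-event) pairs in which the event has the higher risk score; ties count 1/2."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y, p):
    """Mean, over events, of the precision obtained when that event's own score is the threshold."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    total = 0.0
    for t in pos:
        flagged = [yi for yi, pi in zip(y, p) if pi >= t]
        total += sum(flagged) / len(flagged)
    return total / len(pos)


def scaled_brier(y, p):
    """1 - BrS / BrS_null, where the null model assigns the observed event rate to everyone."""
    n = len(y)
    rate = sum(y) / n
    brs = sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / n
    return 1.0 - brs / (rate * (1.0 - rate))


def incv(metric, y, p_existing, p_new):
    """Incremental value: metric of the new model minus metric of the existing model."""
    return metric(y, p_new) - metric(y, p_existing)


# Hypothetical toy data: outcomes and risk scores from an existing and a new model.
y = [1, 1, 0, 0, 0, 0, 0, 0]
p_existing = [0.60, 0.30, 0.50, 0.20, 0.20, 0.10, 0.10, 0.05]
p_new = [0.90, 0.25, 0.40, 0.30, 0.20, 0.10, 0.10, 0.05]

for name, metric in [("dAUC", auc), ("dAP", average_precision), ("dsBrS", scaled_brier)]:
    print(name, round(incv(metric, y, p_existing, p_new), 4))
```

Because AP averages precision only at the thresholds set by events, changes among the highest-ranked subjects move it more than changes lower down, which is the weighting difference the abstract describes.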
Title: A relationship between the incremental values of area under the ROC curve and of area under the precision-recall curve. Journal: Diagnostic and Prognostic Research, p. 13.
Pub Date: 2021-07-02. DOI: 10.1186/s41512-021-00101-x
Andrew S Moriarty, Lewis W Paton, Kym I E Snell, Richard D Riley, Joshua E J Buckman, Simon Gilbody, Carolyn A Chew-Graham, Shehzad Ali, Stephen Pilling, Nick Meader, Bob Phillips, Peter A Coventry, Jaime Delgadillo, David A Richards, Chris Salisbury, Dean McMillan
Background: Most patients who present with depression are treated in primary care by general practitioners (GPs). Relapse of depression is common (at least 50% of patients treated for depression will relapse after a single episode) and leads to considerable morbidity and decreased quality of life for patients. The majority of patients will relapse within 6 months, and those with a history of relapse are more likely to relapse in the future than those with no such history. GPs see a largely undifferentiated case-mix of patients, and once patients with depression reach remission, there is limited guidance to help GPs stratify patients according to risk of relapse. We aim to develop a prognostic model to predict an individual's risk of relapse within 6-8 months of entering remission. The long-term objective is to inform the clinical management of depression after the acute phase.
Methods: We will develop a prognostic model using secondary analysis of individual participant data drawn from seven RCTs and one longitudinal cohort study in primary or community care settings. We will use logistic regression to predict the outcome of relapse of depression within 6-8 months. We plan to include the following established relapse predictors in the model: residual depressive symptoms, number of previous depressive episodes, co-morbid anxiety and severity of index episode. We will use a "full model" development approach, including all available predictors. Performance statistics (optimism-adjusted C-statistic, calibration-in-the-large, calibration slope) and calibration plots (with smoothed calibration curves) will be calculated. Generalisability of predictive performance will be assessed through internal-external cross-validation. Clinical utility will be explored through net benefit analysis.
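Net benefit analysis, mentioned at the end of the methods, can be illustrated with the standard decision-curve definition, NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the risk threshold. The protocol does not spell out its formula, so the sketch below is an assumption based on that common definition; the function names and the toy data are hypothetical.

```python
def net_benefit(y, p, threshold):
    """Net benefit of treating everyone whose predicted risk is at or above `threshold`."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    rate = sum(y) / len(y)
    return rate - (1.0 - rate) * threshold / (1.0 - threshold)


# Hypothetical relapse outcomes (1 = relapse) and predicted risks.
y = [1, 0, 1, 0, 1, 0, 0, 0]
p = [0.80, 0.60, 0.55, 0.30, 0.25, 0.20, 0.10, 0.05]

for pt in (0.2, 0.3, 0.5):
    print(pt, round(net_benefit(y, p, pt), 4), round(net_benefit_treat_all(y, pt), 4))
```

A model is usually considered clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat none).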
Discussion: We will derive a statistical model to predict relapse of depression in remitted depressed patients in primary care. Assuming the model has sufficient predictive performance, we outline the next steps including independent external validation and further assessment of clinical utility and impact.
Study registration: ClinicalTrials.gov ID: NCT04666662.
Title: The development and validation of a prognostic model to PREDICT Relapse of depression in adult patients in primary care: protocol for the PREDICTR study. Journal: Diagnostic and Prognostic Research, 5(1), p. 12.
Pub Date: 2021-05-20. DOI: 10.1186/s41512-021-00098-3
Lisa Shaw, Sara Graziadio, Clare Lendrem, Nicholas Dale, Gary A Ford, Christine Roffe, Craig J Smith, Philip M White, Christopher I Price
Background: Rapid treatment of stroke improves outcomes, but accurate early recognition can be challenging. Between 20 and 40% of patients suspected to have stroke by ambulance and emergency department staff later receive a non-stroke 'mimic' diagnosis after stroke specialist investigation. This early diagnostic uncertainty results in displacement of mimic patients from more appropriate services, inappropriate demands on stroke specialist resources and delayed access to specialist therapies for stroke patients. Blood purine concentrations rise rapidly during hypoxic tissue injury, which is a key mechanism of damage during acute stroke but is not typical in mimic conditions. A portable point-of-care fingerprick test has been developed to measure blood purine concentration, which could be used to triage patients experiencing suspected stroke symptoms into those likely to have a non-stroke mimic condition and those likely to have true stroke. This study is evaluating test performance for identification of stroke mimic conditions.
Methods:
Design: prospective observational cohort study.
Setting: regional UK ambulance and acute stroke services.
Participants: a convenience series of two populations will be tested: adults with a label of suspected stroke assigned (and tested) by attending ambulance personnel and adults with a label of suspected stroke assigned at hospital (who have not been tested by ambulance staff).
Index test: SMARTChip Purine assay.
Reference standard tests: expert clinician opinion informed by brain imaging and/or other investigations will assign the following diagnoses which constitute the suspected stroke population: ischaemic stroke, haemorrhagic stroke, TIA and stroke mimic conditions.
Sample size: ambulance population (powered for mimic sensitivity) 935 participants; hospital population (powered for mimic specificity) 377 participants.
Analyses: area under the receiver operating characteristic (ROC) curve and optimal sensitivity, specificity, and negative and positive predictive values for identification of mimic conditions. The optimal threshold for the ambulance population will maximise sensitivity (minimum 80%) while aiming to keep specificity above 70%. The optimal threshold for the hospital population will maximise specificity (minimum 80%) while aiming to keep sensitivity above 70%.
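One possible reading of the threshold rule can be sketched as a search over candidate cut-offs. This is an illustration only: the protocol does not give an algorithm, so the tie-breaking order, the function names (`sens_spec`, `pick_cutoff`) and the toy data are assumptions. Scores are assumed to be oriented so that higher values indicate the target condition (a mimic); since the abstract says purines rise in stroke rather than in mimics, a raw purine concentration would presumably need to be reversed first.

```python
def sens_spec(y, score, cutoff):
    """Sensitivity and specificity when `score >= cutoff` is called positive."""
    tp = sum(1 for yi, s in zip(y, score) if yi == 1 and s >= cutoff)
    fn = sum(1 for yi, s in zip(y, score) if yi == 1 and s < cutoff)
    tn = sum(1 for yi, s in zip(y, score) if yi == 0 and s < cutoff)
    fp = sum(1 for yi, s in zip(y, score) if yi == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def pick_cutoff(y, score, maximise="sensitivity", floor=0.80, other_floor=0.70):
    """Return (cutoff, sensitivity, specificity), or None if the floor is never reached.

    Assumed ordering: the maximised measure must reach `floor`; among such cut-offs,
    prefer those keeping the other measure above `other_floor`, then the highest
    maximised measure, then the highest other measure.
    """
    best = None
    for cutoff in sorted(set(score)):
        sens, spec = sens_spec(y, score, cutoff)
        primary, secondary = (sens, spec) if maximise == "sensitivity" else (spec, sens)
        if primary < floor:
            continue
        key = (secondary > other_floor, primary, secondary)
        if best is None or key > best[0]:
            best = (key, cutoff, sens, spec)
    return None if best is None else best[1:]


# Hypothetical data: 1 = mimic, 0 = stroke.
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
score = [0.9, 0.8, 0.7, 0.6, 0.2, 0.5, 0.4, 0.3, 0.1, 0.05]

print(pick_cutoff(y, score, maximise="sensitivity"))  # ambulance-style rule
print(pick_cutoff(y, score, maximise="specificity"))  # hospital-style rule
```

Returning `None` when no cut-off reaches the minimum makes the failure case explicit rather than silently relaxing the 80% floor.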
Discussion: The results from this study will determine how accurately the SMARTChip purine assay test can identify stroke mimic conditions within the suspected stroke population. If acceptable performance is confirmed, deployment of the test in ambulances or emergency departments could enable more appropriate direction of patients to stroke or non-stroke services.
Trial registration: Registered with ISRCTN (identifier: ISRCTN22323981) on 13/02/2019 http://www.isrctn.com/ISRCTN22323981.
Title: Purines for Rapid Identification of Stroke Mimics (PRISM): study protocol for a diagnostic accuracy study. Journal: Diagnostic and Prognostic Research, p. 11.
Pub Date: 2021-05-18. DOI: 10.1186/s41512-021-00100-y
Sarah Milosevic, Natalie Joseph-Williams, Bethan Pell, Elizabeth Cain, Robyn Hackett, Ffion Murdoch, Haroon Ahmed, A Joy Allen, Alison Bray, Samantha Clarke, Marcus J Drake, Michael Drinnan, Kerenza Hood, Tom Schatzberger, Yemisi Takwoingi, Emma Thomas-Jones, Raymond White, Adrian Edwards, Chris Harding
Background: Invasive urodynamics is used to investigate the causes of lower urinary tract symptoms and is usually conducted in secondary care by specialist practitioners. No study has yet investigated the feasibility of carrying out this procedure in a non-specialist setting. Therefore, the aim of this study was to explore, using qualitative methodology, the feasibility and acceptability of conducting invasive urodynamic testing in primary care.
Methods: Semi-structured interviews were conducted during the pilot phase of the PriMUS study, in which men experiencing bothersome lower urinary tract symptoms underwent invasive urodynamic testing along with a series of simple index tests in a primary care setting. Interviewees were 25 patients invited to take part in the PriMUS study and 18 healthcare professionals involved in study delivery. Interviews were audio-recorded, transcribed verbatim and analysed using a framework approach.
Results: Patients generally found the urodynamic procedure acceptable and valued the primary care setting due to its increased accessibility and familiarity. Despite some logistical issues, facilitating invasive urodynamic testing in primary care was also a positive experience for urodynamic nurses. Initial issues with general practitioners receiving and utilising the results of urodynamic testing may have limited the potential benefit to some patients. Effective approaches to study recruitment included emphasising the benefits of the urodynamic test and maintaining contact with potential participants by telephone. Patients' relationship with their general practitioner was an important influence on study participation.
Conclusions: Conducting invasive urodynamics in primary care is feasible and acceptable and has the potential to benefit patients. Facilitating study procedures in a familiar primary care setting can impact positively on research recruitment. However, it is vital that there is a support network for urodynamic nurses and expertise available to help interpret urodynamic results.
Title: Conducting invasive urodynamics in primary care: qualitative interview study examining experiences of patients and healthcare professionals. Journal: Diagnostic and Prognostic Research, p. 10.
Pub Date: 2021-04-16. DOI: 10.1186/s41512-021-00099-2
Ross Bicknell, W. Lim, A. Maier, D. Logiudice
Title: Correction to: A study protocol for the development of a multivariable model predicting 6- and 12-month mortality for people with dementia living in residential aged care facilities (RACFs) in Australia. Journal: Diagnostic and Prognostic Research.
Pub Date: 2021-04-02. DOI: 10.1186/s41512-021-00097-4
Daniël A Korevaar, Patrick M Bossuyt, Matthew D F McInnes, Jérémie F Cohen
Title: PRISMA-DTA for Abstracts: a new addition to the toolbox for test accuracy research. Journal: Diagnostic and Prognostic Research, p. 8. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8017829/pdf/
Pub Date: 2021-04-01. DOI: 10.1186/s41512-021-00094-7
Title: Methods for Evaluation of medical prediction Models, Tests And Biomarkers (MEMTAB) 2020 Symposium: Virtual, 10-11 December 2020. Journal: Diagnostic and Prognostic Research, 5(Suppl 1), p. 7.
Pub Date: 2021-03-22. DOI: 10.1186/s41512-021-00096-5
Evangelia Christodoulou, Maarten van Smeden, Michael Edlinger, Dirk Timmerman, Maria Wanitschek, Ewout W Steyerberg, Ben Van Calster
Background: We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data comes in.
Methods: We illustrate the approach using data for the diagnosis of ovarian cancer (n = 5914, 33% event fraction) and obstructive coronary artery disease (CAD; n = 4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000 and re-estimated model performance at each step. We examined the required sample size for satisfying the following stopping rule: obtaining a calibration slope ≥ 0.9 and optimism in the c-statistic (or AUC) ≤ 0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors and correcting for bias on the model estimates (Firth's correction).
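The stopping rule can be expressed as a small check over the sequence of performance estimates. This is a sketch under assumptions, not the authors' implementation: the function name `first_stopping_size`, the input layout and the example numbers are hypothetical, and the sketch reports the second of the two consecutive qualifying sample sizes, a convention the abstract does not state.

```python
def first_stopping_size(history, min_slope=0.9, max_optimism=0.02, consecutive=2):
    """Return the sample size at which the rule is first met, or None.

    `history` is a list of (sample_size, calibration_slope, auc_optimism) tuples,
    ordered by increasing sample size (for example 100, 150, 200, ...).
    """
    run = 0
    for n, slope, optimism in history:
        if slope >= min_slope and optimism <= max_optimism:
            run += 1
            if run >= consecutive:
                return n
        else:
            run = 0  # the criteria must hold at consecutive sample sizes
    return None


# Hypothetical bootstrap estimates obtained after each batch of 50 new patients.
history = [
    (100, 0.70, 0.060),
    (150, 0.91, 0.030),
    (200, 0.92, 0.020),
    (250, 0.88, 0.015),
    (300, 0.93, 0.018),
    (350, 0.95, 0.012),
]
print(first_stopping_size(history))
```

Requiring two consecutive passes guards against stopping on a single favourable estimate; the stricter variant described in the results corresponds to `max_optimism=0.01`.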
Results: Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination were achieved after a median of 450 patients (interquartile range 450-500) for the ovarian cancer data (22 events per parameter (EPP), 20-24) and 850 patients (750-900) for the CAD data (33 EPP, 30-35). A stricter criterion, requiring AUC optimism ≤ 0.01, was met with a median of 500 (23 EPP) and 1500 (59 EPP) patients, respectively. These sample sizes were much higher than the well-known 10 EPP rule of thumb and slightly higher than those from a recently published fixed sample size calculation method by Riley et al. Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth's correction was used.
Conclusions: Adaptive sample size determination can be a useful supplement to fixed a priori sample size calculations, because it allows the sample size to be tailored dynamically to the specific prediction modeling context.
Title: Adaptive sample size determination for the development of clinical prediction models. Journal: Diagnostic and Prognostic Research, p. 6.
Pub Date : 2021-03-08 DOI: 10.1186/s41512-021-00095-6
Stephanie H Read, Laura C Rosella, Howard Berger, Denice S Feig, Karen Fleming, Padma Kaul, Joel G Ray, Baiju R Shah, Lorraine L Lipscombe
Background: Pregnancy offers a unique opportunity to identify women at higher future risk of type 2 diabetes mellitus (DM). In pregnancy, a woman has greater engagement with the healthcare system, and certain conditions that are important markers of future DM risk, such as gestational DM (GDM), are more apt to manifest. This study protocol describes the development and validation of a risk prediction model (RPM) for estimating a woman's 5-year risk of developing type 2 DM after pregnancy.
Methods: Data will be obtained from existing Ontario population-based administrative datasets. The derivation cohort will consist of all women who gave birth in Ontario, Canada between April 2006 and March 2014. Pre-specified predictors will include socio-demographic factors (age at delivery, ethnicity), maternal clinical factors (e.g., body mass index), pregnancy-related events (gestational DM, hypertensive disorders of pregnancy), and newborn factors (birthweight percentile). Incident type 2 DM will be identified by linkage to the Ontario Diabetes Database. Weibull accelerated failure time models will be developed to predict 5-year risk of type 2 DM. Measures of predictive accuracy (Nagelkerke's R²), discrimination (C-statistics), and calibration plots will be generated. Internal validation will be conducted using a bootstrapping approach in 500 samples with replacement, and an optimism-corrected C-statistic will be calculated. External validation of the RPM will be conducted by applying the model in a large population-based pregnancy cohort in Alberta, and estimating the above measures of model performance. The model will be re-calibrated by adjusting baseline hazards and coefficients where appropriate.
Discussion: The derived RPM may help identify women at high risk of developing DM in the 5 years after pregnancy, thereby facilitating lifestyle changes and more frequent screening for type 2 DM in these women.
Title: Diabetes after pregnancy: a study protocol for the derivation and validation of a risk prediction model for 5-year risk of diabetes following pregnancy.
Journal: Diagnostic and prognostic research, p. 5. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7938478/pdf/
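To illustrate how a Weibull accelerated failure time model yields a 5-year risk, the sketch below assumes the common parameterization log T = linear predictor + scale × W, where W follows the standard minimum extreme-value distribution, so that the risk by time t is 1 − exp(−exp((log t − linear predictor) / scale)). The protocol does not state its parameterization, and the function name `weibull_aft_risk` and all coefficients below are hypothetical values chosen for illustration only.

```python
import math


def weibull_aft_risk(t_years: float, linear_predictor: float, scale: float) -> float:
    """Probability of an event by time t under log T = linear_predictor + scale * W,
    where W follows the standard minimum extreme-value distribution."""
    if t_years <= 0 or scale <= 0:
        raise ValueError("t_years and scale must be positive")
    z = (math.log(t_years) - linear_predictor) / scale
    survival = math.exp(-math.exp(z))  # S(t) = exp(-exp(z))
    return 1.0 - survival


# Hypothetical coefficients (not estimates from the protocol).
intercept, beta_gdm, scale = 4.5, -1.2, 0.8
for gdm in (0, 1):
    lp = intercept + beta_gdm * gdm  # linear predictor on the log-time scale
    print(f"GDM={gdm}: 5-year risk = {weibull_aft_risk(5.0, lp, scale):.4f}")
```

With these made-up values, a negative coefficient shortens the expected time to diabetes and therefore raises the predicted 5-year risk.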
Pub Date : 2021-02-08 DOI: 10.1186/s41512-021-00093-8
Brian D Nicholson, Gail Hayward, Philip J Turner, Joseph J Lee, Alexandra Deeks, Mary Logan, Abigail Moore, Anna Seeley, Thomas Fanshawe, Jason Oke, Constantinos Koshiaris, James P Sheppard, Uy Hoang, Vaishnavi Parimalanathan, George Edwards, Harshana Liyange, Julian Sherlock, Rachel Byford, Maria Zambon, Joanna Ellis, Jamie Lopez Bernal, Gayatri Amirthalingam, Ezra Linley, Ray Borrow, Gary Howsam, Sophie Baines, Filipa Ferreira, Simon de Lusignan, Rafael Perera, F D Richard Hobbs
Background: The aim of RApid community Point-of-care Testing fOR COVID-19 (RAPTOR-C19) is to assess the diagnostic accuracy of multiple current and emerging point-of-care tests (POCTs) for active and past SARS-CoV2 infection in the community setting. RAPTOR-C19 will provide the community testbed to the COVID-19 National DiagnOstic Research and Evaluation Platform (CONDOR).
Methods: RAPTOR-C19 incorporates a series of prospective observational parallel diagnostic accuracy studies of SARS-CoV2 POCTs against laboratory and composite reference standards in patients with suspected current or past SARS-CoV2 infection attending community settings. Adults and children with suspected current SARS-CoV2 infection who are having an oropharyngeal/nasopharyngeal (OP/NP) swab for laboratory SARS-CoV2 reverse transcriptase Digital/Real-Time Polymerase Chain Reaction (d/rRT-PCR) as part of clinical care or community-based testing will be invited to participate. Adults (≥ 16 years) with suspected past symptomatic infection will also be recruited. Asymptomatic individuals will not be eligible. At the baseline visit, all participants will be asked to submit samples for at least one candidate point-of-care test (POCT) being evaluated (index test/s) as well as an OP/NP swab for laboratory SARS-CoV2 RT-PCR performed by Public Health England (PHE) (reference standard for current infection). Adults will also be asked for a blood sample for laboratory SARS-CoV-2 antibody testing by PHE (reference standard for past infection); where feasible, adults will be invited to attend a second visit at 28 days for repeat antibody testing. Additional study data (e.g. demographics, symptoms, observations, household contacts) will be captured electronically. Sensitivity, specificity, and positive and negative predictive values for each POCT will be calculated with exact 95% confidence intervals when compared to the reference standard. POCTs will also be compared to composite reference standards constructed using paired antibody test results, patient-reported outcomes, linked electronic health records for outcomes related to COVID-19 such as hospitalisation or death, and other test results.
Discussion: High-performing POCTs for community use could be transformational. Real-time results could lead to personal and public health impacts such as reducing onward household transmission of SARS-CoV2 infection, improving surveillance of health and social care staff, contributing to accurate prevalence estimates, and improving understanding of SARS-CoV2 transmission dynamics in the population. In contrast, poorly performing POCTs could have negative effects, so it is necessary to undertake community-based diagnostic accuracy evaluations before rolling these out.
Title: Rapid community point-of-care testing for COVID-19 (RAPTOR-C19): protocol for a platform diagnostic study.
Trial registration: ISRCTN, ISRCTN14226970.
Journal: Diagnostic and prognostic research, p. 4. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7868893/pdf/
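The accuracy measures named in the RAPTOR-C19 protocol (sensitivity, specificity, and positive and negative predictive values with exact 95% confidence intervals) can be sketched from a 2×2 table as below. The protocol does not say which exact method will be used; this sketch assumes the Clopper-Pearson interval, found here by bisection on the binomial distribution function. The function names and the counts in the usage example are hypothetical.

```python
import math


def _binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact (Clopper-Pearson) confidence interval for k successes in n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")

    def solve(j: int, target: float) -> float:
        # _binom_cdf(j, n, p) decreases in p; bisect for the p that hits target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if _binom_cdf(j, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(k - 1, 1 - alpha / 2)
    upper = 1.0 if k == n else solve(k, alpha / 2)
    return lower, upper


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Point estimates with exact 95% intervals as (estimate, lower, upper)."""
    measures = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, *clopper_pearson(k, n)) for name, (k, n) in measures.items()}


# Hypothetical counts against the reference standard (not study data).
for name, (est, lo, hi) in diagnostic_accuracy(tp=90, fp=20, fn=10, tn=80).items():
    print(f"{name}: {est:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Bisection keeps the sketch free of third-party dependencies; a statistics library that provides beta quantiles would give the same interval more directly.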