Classification accuracy goals for diagnostic tests based on risk stratification
Pub Date: 2021-03-17 | DOI: 10.1080/24709360.2021.1878406
G. Pennello
In diagnostic test evaluation, performance goals are often set for classification accuracy measures such as specificity, sensitivity and the diagnostic likelihood ratio. For tests that detect rare conditions, classification accuracy goals are attractive because they can be evaluated in case-control studies enriched for the condition. A neglected area of research is determining classification accuracy goals that confer clinical usefulness of a test. We determine classification accuracy goals based on desired risk stratification, i.e., the post-test risk of having the condition compared with the pre-test risk. We determine goals for rule-out tests, rule-in tests, and tests that do both. Goals for the negative and positive likelihood ratios (NLR, PLR) are emphasized because of their natural relationships with risk stratification via Bayes' theorem. Goals for specificity and sensitivity are implied by the goals for NLR and PLR. Goals that confer superiority or non-inferiority of a test to a comparator are based on approximating risk differences and relative risks by functions of likelihood ratios. Inference is based on Wald confidence intervals for ratios of likelihood ratios. To illustrate, we consider hypothetical data on a fetal fibronectin assay for ruling out risk of preterm birth and two human papillomavirus assays for detecting cervical cancer. Trial registration: ClinicalTrials.gov identifier NCT01931566.
{"title":"Classification accuracy goals for diagnostic tests based on risk stratification","authors":"G. Pennello","doi":"10.1080/24709360.2021.1878406","DOIUrl":"https://doi.org/10.1080/24709360.2021.1878406","url":null,"abstract":"In diagnostic test evaluation, performance goals are often set for classification accuracy measures such as specificity, sensitivity and diagnostic likelihood ratio. For tests that detect rare conditions, classification accuracy goals are attractive because they can be evaluated in case-control studies enriched for the condition. A neglected area of research is determining classification accuracy goals that confer clinical usefulness of a test. We determine classification accuracy goals based on desired risk stratification, i.e. the post-test risk of having the condition compared with the pre-test risk. We determine goals for rule-out tests, rule-in tests, and those that do both. Goals for negative and positive likelihood ratios (NLR, PLR) are emphasized because of their natural relationships with risk stratification via Bayes Theorem. Goals for specificity and sensitivity are implied by goals on NLR and PLR. Goals that confer superiority or non-inferiority of a test to a comparator are based on approximating risk differences and relative risks by functions of likelihood ratios. Inference is based on Wald confidence intervals for ratios of likelihood ratios. To illustrate, we consider hypothetical data on a fetal fibronectin assay for ruling out risk of pre-term birth and two human papillomavirus assays for detecting cervical cancer. Trial registration ClinicalTrials.gov identifier: NCT01931566.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"5 1","pages":"149 - 168"},"PeriodicalIF":0.0,"publicationDate":"2021-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2021.1878406","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44791322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Case fatality risk estimated from routinely collected disease surveillance data: application to COVID-19
Pub Date: 2021-01-02 | DOI: 10.1080/24709360.2021.1913708
I. Marschner
Case fatality risk (CFR) is the probability of death among cases of a disease. A crude CFR estimate is the ratio of the number of deaths to the number of cases of the disease. This estimate is biased, however, particularly during outbreaks of emerging infectious diseases such as COVID-19, because the death times of recent cases are subject to right censoring. Instead, we propose deconvolution methods applied to routinely collected surveillance data of unlinked case and death counts over time. We begin by considering the death series to be the convolution of the case series and the fatality distribution, which is the subdistribution of the time between diagnosis and death. We then use deconvolution methods to estimate this fatality distribution. This provides a CFR estimate together with information about the distribution of time to death. Importantly, this information is extracted without the need to make the strong assumptions used in previous analyses. The methods are applied to COVID-19 surveillance data from a range of countries, illustrating substantial CFR differences. Simulations show that crude approaches lead to underestimation, particularly early in an outbreak, and that the proposed approach can rectify this bias. An R package called covidSurv is available for implementing the analyses.
{"title":"Case fatality risk estimated from routinely collected disease surveillance data: application to COVID–19","authors":"I. Marschner","doi":"10.1080/24709360.2021.1913708","DOIUrl":"https://doi.org/10.1080/24709360.2021.1913708","url":null,"abstract":"Case fatality risk (CFR) is the probability of death among cases of a disease. A crude CFR estimate is the ratio of the number deaths to the number of cases of the disease. This estimate is biased, however, particularly during outbreaks of emerging infectious diseases such as COVID-19, because the death time of recent cases is subject to right censoring. Instead, we propose deconvolution methods applied to routinely collected surveillance data of unlinked case and death counts over time. We begin by considering the death series to be the convolution of the case series and the fatality distribution, which is the subdistribution of the time between diagnosis and death. We then use deconvolution methods to estimate this fatality distribution. This provides a CFR estimate together with information about the distribution of time to death. Importantly, this information is extracted without the need to make strong assumptions used in previous analyses. The methods are applied to COVID-19 surveillance data from a range of countries illustrating substantial CFR differences. Simulations show that crude approaches lead to underestimation, particularly early in an outbreak, and that the proposed approach can rectify this bias. An R package called covidSurv is available for implementing the analyses.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"5 1","pages":"49 - 68"},"PeriodicalIF":0.0,"publicationDate":"2021-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2021.1913708","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"60127878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Methods for detecting outlying regions and influence diagnosis in multi-regional clinical trials
Pub Date: 2021-01-02 | DOI: 10.1080/24709360.2021.1921944
M. Aoki, H. Noma, M. Gosho
Due to the globalization of drug development, multi-regional clinical trials (MRCTs) have been increasingly adopted in clinical evaluations. In MRCTs, the primary objective is to demonstrate the efficacy of new drugs in all participating regions, but heterogeneity of various relevant factors across these regions can cause inconsistency of treatment effects. In particular, outlying regions with extreme profiles can influence the overall conclusions of these studies. In this article, we propose quantitative methods to detect these outlying regions and to assess their influences in MRCTs. The approaches are as follows: (1) a method using a dfbeta-like measure, a studentized residual obtained by a leave-one-out cross-validation (LOOCV) scheme; (2) a model-based significance testing method using a mean-shifted model; (3) a method using a relative change measure for the variance estimate of the overall effect estimator; and (4) a method using a relative change measure for the heterogeneity variance estimate in a random-effects model. Parametric bootstrap schemes are proposed to accurately assess the statistical significance and variabilities of the aforementioned influence diagnostic tools. We illustrate the effectiveness of these proposed methods via applications to two MRCTs, the RECORD and RENAAL studies.
{"title":"Methods for detecting outlying regions and influence diagnosis in multi-regional clinical trials","authors":"M. Aoki, H. Noma, M. Gosho","doi":"10.1080/24709360.2021.1921944","DOIUrl":"https://doi.org/10.1080/24709360.2021.1921944","url":null,"abstract":"Due to the globalization of drug development, multi-regional clinical trials (MRCTs) have been increasingly adopted in clinical evaluations. In MRCTs, the primary objective is to demonstrate the efficacy of new drugs in all participating regions, but heterogeneity of various relevant factors across these regions can cause inconsistency of treatment effects. In particular, outlying regions with extreme profiles can influence the overall conclusions of these studies. In this article, we propose quantitative methods to detect these outlying regions and to assess their influences in MRCTs. The approaches are as follows: (1) a method using a dfbeta-like measure, a studentized residual obtained by a leave-one-out cross-validation (LOOCV) scheme; (2) a model-based significance testing method using a mean-shifted model; (3) a method using a relative change measure for the variance estimate of the overall effect estimator; and (4) a method using a relative change measure for the heterogeneity variance estimate in a random-effects model. Parametric bootstrap schemes are proposed to accurately assess the statistical significance and variabilities of the aforementioned influence diagnostic tools. We illustrate the effectiveness of these proposed methods via applications to two MRCTs, the RECORD and RENAAL studies.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"5 1","pages":"30 - 48"},"PeriodicalIF":0.0,"publicationDate":"2021-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2021.1921944","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45719376","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Region as a risk factor for asthma prevalence: statistical evidence from administrative data
Pub Date: 2021-01-02 | DOI: 10.1080/24709360.2021.1924495
R. Wesonga, Khidir M. Abdelbasit
Geographical regions may have an influence on asthma exacerbation. No conclusive study has been conducted to fully support or refute this assertion. We sought to use a data-driven approach to investigate the possible effect of geographical location on asthma. This study was based on data collected by the Ministry of Health over a 6-year period from 2010 to 2015 and presented in its annual reports. Prevalence rates for 11 regions were computed, and analysis of variance and regression models were used to determine whether region is a proximal risk factor for asthma. Our results show a statistically significant difference in asthma prevalence rates among the 11 regions. The odds of asthma among the male population were 18% higher than among the female population (OR = 1.18, p = .011). Asthma prevalence increased marginally over the period. Further, five groups of regions were derived based on asthma prevalence rates and trends. Region is a proximal risk factor and was significantly associated with asthma prevalence over the period. We recommend the creation of a control mechanism that targets regions with higher prevalence and increasing trends.
{"title":"Region as a risk factor for asthma prevalence: statistical evidence from administrative data","authors":"R. Wesonga, Khidir M. Abdelbasit","doi":"10.1080/24709360.2021.1924495","DOIUrl":"https://doi.org/10.1080/24709360.2021.1924495","url":null,"abstract":"Geographical regions may have an influence on asthma exacerbation. No conclusive study has been conducted to fully support or dissipate this assertion. We sought to use a data-driven approach to investigate the possible effect of geographical location on asthma. This study was based on data collected by the Ministry of Health over a 6-year period from 2010 to 2015 and presented in their annual reports. Prevalence rates for 11 regions were computed using the analysis of variance and regression models to determine the proximal nature of the region as a risk factor for asthma. Our results show a statistically significant difference in prevalence rates of asthma among the 11 regions. The asthma prevalence rate among the male population was 18% (OR = 1.18, p = .011) more than for the female population. There was a positive marginal increase in the asthma prevalence over the period. Further, five groups were derived based on asthma prevalence rates and trends. The region has proximal risk factor and significantly associated with asthma prevalence over the period. We recommend the creation of a control mechanism that targets regions with higher prevalence and increasing trends.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"5 1","pages":"19 - 29"},"PeriodicalIF":0.0,"publicationDate":"2021-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2021.1924495","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42433147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparative analysis of epidemiological models for COVID-19 pandemic predictions
Pub Date: 2021-01-02 | DOI: 10.1080/24709360.2021.1913709
Rajan Gupta, G. Pandey, S. Pal
Epidemiological modeling is an important problem around the world. This research presents an analysis of COVID-19 data to understand which models work better for different regions. A comparative analysis of three growth curve fitting models (Gompertz, logistic, and exponential), two mathematical models (SEIR and IDEA), two forecasting models (Holt's exponential smoothing and ARIMA), and four machine/deep learning models (neural networks, LSTM networks, GANs, and random forest), using three evaluation criteria on ten prominent regions from North America, South America, Europe, and Asia, is presented. The minimum and median values for RMSE were 1.8 and 5372.9; the values for the mean absolute percentage error were 0.005 and 6.63; and the values for AIC were 87.07 and 613.3, respectively, from a total of 125 experiments across the 10 regions. The growth curve fitting models worked well where flattening of the case counts had started. Based on a region's growth curve, a relevant model from the list can be used for predicting the number of infected COVID-19 cases. Some other models used in forecasting case numbers are noted in the future work section, which can help researchers forecast cases in different regions of the world.
{"title":"Comparative analysis of epidemiological models for COVID-19 pandemic predictions","authors":"Rajan Gupta, G. Pandey, S. Pal","doi":"10.1080/24709360.2021.1913709","DOIUrl":"https://doi.org/10.1080/24709360.2021.1913709","url":null,"abstract":"Epidemiological modeling is an important problem around the world. This research presents COVID-19 analysis to understand which model works better for different regions. A comparative analysis of three growth curve fitting models (Gompertz, Logistic, and Exponential), two mathematical models (SEIR and IDEA), two forecasting models (Holt's exponential and ARIMA), and four machine/deep learning models (Neural Network, LSTM Networks, GANs, and Random Forest) using three evaluation criteria on ten prominent regions around the world from North America, South America, Europe, and Asia has been presented. The minimum and median values for RMSE were 1.8 and 5372.9; the values for the mean absolute percentage error were 0.005 and 6.63; and the values for AIC were 87.07 and 613.3, respectively, from a total of 125 experiments across 10 regions. The growth curve fitting models worked well where flattening of the cases has started. Based on region's growth curve, a relevant model from the list can be used for predicting the number of infected cases for COVID-19. Some other models used in forecasting the number of cases have been added in the future work section, which can help researchers to forecast the number of cases in different regions of the world.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"5 1","pages":"69 - 91"},"PeriodicalIF":0.0,"publicationDate":"2021-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2021.1913709","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48622498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Concordance Measures and Time-Dependent ROC Methods
Pub Date: 2021-01-01 | Epub Date: 2021-05-25 | DOI: 10.1080/24709360.2021.1926189
Norberto Pantoja-Galicia, Olivia I Okereke, Deborah Blacker, Rebecca A Betensky
The receiver operating characteristic (ROC) curve displays sensitivity versus 1 − specificity over a set of thresholds. The area under the ROC curve (AUC) is a global scalar summary of this curve. In the context of time-dependent ROC methods, we are interested in global scalar measures that summarize sequences of time-dependent AUCs over time. The concordance probability is a candidate for such purposes. The concordance probability can provide a global assessment of the discrimination ability of a test for an event that occurs at random times and may be right censored. If the test adequately differentiates between subjects who survive longer and those who survive shorter times, this will assist clinical decisions. In this context, the concordance probability may support assessment of precision medicine tools based on prognostic biomarker models for overall survival. Definitions of time-dependent sensitivity and specificity are reviewed. Some connections between such definitions and concordance measures are also reviewed, and we establish new connections via new measures of global concordance. We explore the relationship between such measures and their corresponding time-dependent AUCs. To illustrate these concepts, an application in the context of Alzheimer's disease is presented.
{"title":"Concordance Measures and Time-Dependent ROC Methods.","authors":"Norberto Pantoja-Galicia, Olivia I Okereke, Deborah Blacker, Rebecca A Betensky","doi":"10.1080/24709360.2021.1926189","DOIUrl":"10.1080/24709360.2021.1926189","url":null,"abstract":"<p><p>The receiver operating characteristic (ROC) curve displays sensitivity versus 1-specificity over a set of thresholds. The area under the ROC curve (AUC) is a global scalar summary of this curve. In the context of time-dependent ROC methods, we are interested in global scalar measures that summarize sequences of time-dependent AUCs over time. The concordance probability is a candidate for such purposes. The concordance probability can provide a global assessment of the discrimination ability of a test for an event that occurs at random times and may be right censored. If the test adequately differentiates between subjects who survive longer times and those who survive shorter times, this will assist clinical decisions. In this context the concordance probability may support assessment of precision medicine tools based on prognostic biomarkers models for overall survival. Definitions of time-dependent sensitivity and specificity are reviewed. Some connections between such definitions and concordance measures are also reviewed and we establish new connections via new measures of global concordance. We explore the relationship between such measures and their corresponding time-dependent AUC. To illustrate these concepts, an application in the context of Alzheimer's disease is presented.</p>","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":" ","pages":"232-249"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9523576/pdf/nihms-1701256.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40389965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Number needed to test: quantifying risk stratification provided by diagnostic tests and risk predictions
Pub Date: 2020-08-07 | DOI: 10.1080/24709360.2020.1796176
H. Katki, R. Dey, P. Saha-Chaudhuri
Risk stratification is the ability of a test or model to separate those at high vs. low risk of disease. No existing risk stratification metric is expressed in terms of the number of people requiring testing, a scale that would help in weighing the benefits, harms, and costs associated with the test and interventions. We introduce the expected number needed to test (NNtest) to identify one more disease case than by randomly selecting people for disease ascertainment. We show that NNtest measures risk stratification, allowing us to decompose NNtest into components that contrast the increase in risk upon testing positive ('concern') versus the decrease in risk upon testing negative ('reassurance'). A graph of the reciprocals of concern vs. reassurance has linear contours of constant NNtest, visualizing the relative importance and tradeoff of each component to better understand the properties of risk thresholds with equal NNtest. We apply NNtest to the controversy over the risk threshold for who should get testing for BRCA1/2 mutations, which cause high risks of breast and ovarian cancers. We show that risk thresholds between 0.78% and 5% optimize NNtest. At these thresholds, people will require risk-model evaluation to find one more mutation carrier. However, these thresholds of equal NNtest provide very different concern and reassurance, with 0.78% providing much more reassurance (and thus much less concern) than 5%. Given that genetic testing costs are declining rapidly, the greater reassurance provided by the 0.78% threshold might be deemed more important than the greater concern provided by the 5% threshold.
{"title":"Number needed to test: quantifying risk stratification provided by diagnostic tests and risk predictions","authors":"H. Katki, R. Dey, P. Saha-Chaudhuri","doi":"10.1080/24709360.2020.1796176","DOIUrl":"https://doi.org/10.1080/24709360.2020.1796176","url":null,"abstract":"Risk stratification is the ability of a test or model to separate those at high vs. low risk of disease. There is no risk stratification metric that is in terms of the number of people requiring testing, which would help with considering the benefits, harms, and costs associated with the test and interventions. We introduce the expected number needed to test (NNtest) to identify one more disease case than by randomly selecting people for disease ascertainment. We show that NNtest measures risk stratification, allowing us to decompose NNtest into components that contrast the increase in risk upon testing positive (‘concern’) versus the decrease in risk upon testing negative (‘reassurance’). A graph of the reciprocals of concern vs. reassurance have linear contours of constant NNtest, visualizing the relative importance and tradeoff of each component to better understand the properties of risk thresholds with equal NNtest. We apply NNtest to the controversy over the risk threshold for who should get testing for BRCA1/2 mutations that cause high risks of breast and ovarian cancers. We show that risk thresholds between 0.78% and 5% optimize NNtest. At these thresholds, people will require risk-model evaluation to find one more mutation-carrier. However, these thresholds of equal NNtest provide very different concern and reassurance, with 0.78% providing much more reassurance (and thus much less concern) than 5%. Given that genetic testing costs are declining rapidly, the greater reassurance provided by the 0.78% threshold might be deemed more important than the greater concern provided by the 5% threshold.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"5 1","pages":"134 - 148"},"PeriodicalIF":0.0,"publicationDate":"2020-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2020.1796176","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42483253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contrast-specific propensity scores
Pub Date: 2020-07-02 | DOI: 10.1080/24709360.2021.1936421
Shasha Han, D. Rubin
Basic propensity score methodology is designed to balance the distributions of multivariate pre-treatment covariates when comparing one active treatment with one control treatment. However, practical settings often involve comparing more than two treatments, where contrasts more complicated than the basic treatment-control one are relevant. Here, we propose the use of contrast-specific propensity scores (CSPS), which allows the creation of treatment groups of units that are balanced with respect to bifurcations of the specified contrasts and the multivariate space spanned by these bifurcations.
{"title":"Contrast-specific propensity scores","authors":"Shasha Han, D. Rubin","doi":"10.1080/24709360.2021.1936421","DOIUrl":"https://doi.org/10.1080/24709360.2021.1936421","url":null,"abstract":"Basic propensity score methodology is designed to balance the distributions of multivariate pre-treatment covariates when comparing one active treatment with one control treatment. However, practical settings often involve comparing more than two treatments, where more complicated contrasts than the basic treatment-control one, , are relevant. Here, we propose the use of contrast-specific propensity scores (CSPS), which allows the creation of treatment groups of units that are balanced with respect to bifurcations of the specified contrasts and the multivariate space spanned by these bifurcations.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"5 1","pages":"1 - 8"},"PeriodicalIF":0.0,"publicationDate":"2020-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2021.1936421","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46223106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new regression model for the forecasting of COVID-19 outbreak evolution: an application to Italian data
Pub Date: 2020-06-12 | DOI: 10.1080/24709360.2021.1978270
D. Sisti, E. Rocchi, S. Peluso, S. Amatori, M. Carletti
The novel coronavirus SARS-CoV-2 was first identified in China in December 2019. In just over five months, the virus affected over 4 million people and caused about 300,000 deaths. This study aimed to model new COVID-19 cases in the Italian regions using a new curve. A new empirical curve is proposed to model the number of new cases of COVID-19. It resembles a standard exponential growth curve, which has a straight line as its exponent, but in the proposed growth curve the exponent is a logistic curve multiplied by a straight line. This curve shows an initial phase of the expected exponential growth, then rises to a maximum value, and finally decays to zero. We characterized the epidemic growth patterns for the entire Italian nation and for each of the 20 Italian regions. The estimated growth curve was used to calculate the expected times of the beginning, the peak, and the end of the epidemic. Our analysis explores the development of the outbreaks in Italy and the impact of the containment measures. The data obtained are useful for forecasting future scenarios and the possible end of the epidemic.
{"title":"A new regression model for the forecasting of COVID-19 outbreak evolution: an application to Italian data","authors":"D. Sisti, E. Rocchi, S. Peluso, S. Amatori, M. Carletti","doi":"10.1080/24709360.2021.1978270","DOIUrl":"https://doi.org/10.1080/24709360.2021.1978270","url":null,"abstract":"The novel coronavirus SARS-CoV-2 was first identified in China in December 2019. In just over five months, the virus affected over 4 million people and caused about 300,000 deaths. This study aimed to model new COVID-19 cases in Italian regions using a new curve. A new empirical curve is proposed to model the number of new cases of COVID-19. It resembles a known exponential growth curve, which has a straight line as an exponent, but in the growth curve proposed, the exponent is a logistic curve multiplied for a straight line. This curve shows an initial phase, the expected exponential growth, then rises to the maximum value and finally reaches zero. We characterized the epidemic growth patterns for the entire Italian nation and each of the 20 Italian regions. The estimated growth curve has been used to calculate the expected time of the beginning, the time related to peak, and the end of the epidemics. Our analysis explores the development of the outbreaks in Italy and the impact of the containment measures. Data obtained are useful to forecast future scenarios and the possible end of the epidemic.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"6 1","pages":"48 - 56"},"PeriodicalIF":0.0,"publicationDate":"2020-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44476898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Regression with incomplete multivariate surrogate responses for a latent covariate
Pub Date: 2020-01-01 | DOI: 10.1080/24709360.2020.1794705
Hua Shen, R. Cook
We consider the setting in which a categorical exposure variable of interest can only be measured subject to misclassification via surrogate variables. These surrogate variables may represent the classification of an individual via imperfect diagnostic tests. In such settings, a random number of diagnostic tests may be ordered at the discretion of a treating physician, with the decision to order further tests made in a sequential fashion based on preliminary test results. Because the underlying latent status is not ascertainable, these cheaper but imperfect surrogate test results are used in lieu of the definitive classification in a model for a long-term outcome. Naive use of a single surrogate or of functions of the available surrogates can lead to biased estimators of the association and invalid inference. We propose a likelihood-based approach for modeling the effect of the latent variable in the absence of validation data, with estimation based on an expectation-maximization (EM) algorithm. The method yields consistent and efficient estimates and is shown to outperform several common alternative approaches. The performance of the proposed method is demonstrated in simulation studies, and its utility is illustrated by applying the proposed method to the motivating study on breast cancer.
{"title":"Regression with incomplete multivariate surrogate responses for a latent covariate","authors":"Hua Shen, R. Cook","doi":"10.1080/24709360.2020.1794705","DOIUrl":"https://doi.org/10.1080/24709360.2020.1794705","url":null,"abstract":"ABSTRACT We consider the setting in which a categorical exposure variable of interest can only be measured subject to misclassification via surrogate variables. These surrogate variables may represent the classification of an individual via imperfect diagnostic tests. In such settings, a random number of diagnostic tests may be ordered at the discretion of a treating physician with the decision to order further tests made in a sequential fashion based on the results of preliminary test results. Because the underlying latent status is not ascertainable these cheaper but imperfect surrogate test results are used in lieu of the definitive classification in a model for a long-term outcome. Naive use of a single surrogate or functions of the available surrogates can lead to biased estimators of the association and invalid inference. We propose a likelihood-based approach for modeling the effect of the latent variable in the absence of validation data with estimation based on an expectation–maximization (EM) algorithm. The method yields consistent and efficient estimates and is shown to out-perform several common alternative approaches. The performance of the proposed method is demonstrated in simulation studies and its utility is illustrated by applying the proposed method to the stimulating study on breast cancer.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"4 1","pages":"247 - 264"},"PeriodicalIF":0.0,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/24709360.2020.1794705","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45075884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}