Comparing methods for risk prediction of multicategory outcomes: dichotomized logistic regression vs. multinomial logit regression
Authors: Lei Li, Matthew A. Rysavy, G. Bobashev, Abhik Das
Research Square, published 2024-02-05
DOI: https://doi.org/10.21203/rs.3.rs-3911212/v1
Abstract

Background: Medical outcomes of interest to clinicians may have multiple categories. Researchers face several options for risk prediction of such outcomes, including dichotomized logistic regression and multinomial logit regression modeling. We aimed to compare these methods and provide practical guidance.

Methods: We described dichotomized logistic regression, competing risks regression, and continuation-ratio logit regression for ordinal outcomes, an alternative to standard multinomial logit regression. We then applied these methods to develop prediction models of survival and growth outcomes based on the NICHD Extremely Preterm Birth Outcome Tool model. We examined the statistical and practical advantages and flaws of these methods and assessed both the discrimination and the calibration of the estimated models.

Results: The dichotomized logistic models and the multinomial continuation-ratio logit model had similar discrimination and calibration in predicting death and survival without neurodevelopmental impairment. However, the continuation-ratio logit model had better discrimination and calibration in predicting probabilities of neurodevelopmental impairment. The sum of predicted probabilities of outcome categories from the logistic models did not equal 100% for about half of the study infants, ranging from 87.7% to 124.0%, and the logistic model of neurodevelopmental impairment greatly overpredicted the risk among low-risk infants and underpredicted it among high-risk infants.

Conclusions: Estimating multiple logistic regression models of dichotomized outcomes may result in poorly calibrated predictions. For an outcome with multiple ordinal categories, continuation-ratio logit regression is a useful alternative to standard multinomial logit regression. It produces better calibrated predictions and has the advantages of simplicity in model interpretation and flexibility to include outcome category-specific predictors and random-effect terms for patient heterogeneity by hospital.
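
The point about predicted probabilities not summing to 100% can be illustrated with a minimal sketch. It is not the authors' model: the coefficients, the single predictor, and the function names below are hypothetical, chosen only to show the mechanics. A continuation-ratio logit model predicts, at each stage j, the conditional probability of falling in category j given that the outcome is in category j or later; chaining those conditional probabilities yields category probabilities that sum to 1 by construction. Separately fitted logistic models of each dichotomized outcome carry no such constraint.

```python
import numpy as np

def sigmoid(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def continuation_ratio_probs(X, intercepts, coefs):
    """Category probabilities from a continuation-ratio logit model.

    X: (n, p) predictors; intercepts: (K-1,); coefs: (K-1, p).
    Stage j models P(Y = j | Y >= j); the last category takes what remains.
    """
    X = np.asarray(X, dtype=float)
    cond = sigmoid(np.asarray(intercepts) + X @ np.asarray(coefs).T)  # (n, K-1)
    n, n_stages = cond.shape
    probs = np.empty((n, n_stages + 1))
    remaining = np.ones(n)  # P(Y >= j), starts at 1
    for j in range(n_stages):
        probs[:, j] = remaining * cond[:, j]
        remaining = remaining * (1.0 - cond[:, j])
    probs[:, -1] = remaining
    return probs

def separate_logistic_probs(X, intercepts, coefs):
    """One independent logistic model per dichotomized category (no sum constraint)."""
    X = np.asarray(X, dtype=float)
    return sigmoid(np.asarray(intercepts) + X @ np.asarray(coefs).T)

# Hypothetical single predictor and made-up coefficients (illustration only).
X = np.array([[-1.0], [0.0], [1.0]])

# Three ordered categories, e.g. death, survival with impairment,
# survival without impairment (labels assumed for illustration).
cr = continuation_ratio_probs(X, intercepts=[-1.0, 0.0], coefs=[[1.0], [-1.0]])
sep = separate_logistic_probs(
    X, intercepts=[-1.0, 0.0, -0.5], coefs=[[1.0], [-1.0], [0.2]]
)

print("continuation-ratio row sums:", cr.sum(axis=1))
print("separate logistic row sums: ", sep.sum(axis=1))
```

In this sketch the continuation-ratio rows always sum to 1, while the rows from the separate logistic models generally do not, which mirrors the calibration problem described in the Results.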