Bruno Legendre, Damiano Cerasuolo, Olivier Dejardin, Annabel Boyer
Nephrologie & Therapeutique, volume 19, issue 3, pages 171-179. Published 2023-06-19. Journal Article. DOI: 10.1684/ndt.2023.24 (https://doi.org/10.1684/ndt.2023.24)
How to deal with missing data? Multiple imputation by chained equations: recommendations and explanations for clinical practice
Missing data, a constant problem in medical research, have several consequences: a systematic loss of power, with or without a reduction in the representativeness of the sample analyzed. There are three types of missing data: 1) missing completely at random (MCAR); 2) missing at random (MAR); 3) missing not at random (MNAR). Multiple imputation by chained equations handles missing data correctly under the MCAR and MAR assumptions. For each missing value j, it simulates a number m of values that are plausible given the other variables. A random component is included in this simulation to express the uncertainty. Several data sets are thus created and analyzed individually, in an identical way. The estimates from each data set are then combined to obtain a global estimate. Multiple imputation increases power, corrects for some biases, and has the advantage of being applicable to many types of variables. Complete case analysis should no longer be the norm. The objective of this guide is to help the reader conduct an analysis with multiply imputed data. We cover the different types of missing data and the historical approaches to handling them, then detail the multiple imputation method using chained equations. We provide a code example for the mice package of R®.
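The combination step mentioned in the abstract is usually done with Rubin's rules. The following is a minimal Python sketch of that pooling step only, not the authors' R code; the function name `pool_rubin` and the example numbers are hypothetical. It assumes each of the m imputed data sets has already been analyzed and has produced one point estimate and its squared standard error. In the mice package, the `pool()` function fills this role.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-data-set estimates with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total, sqrt(total)


# Hypothetical example: a regression coefficient estimated on m = 3 imputed sets.
pooled, total_var, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var, pooled_se)
```

The between-imputation term is what carries the uncertainty due to the missing values: if all imputed data sets gave the same estimate, the total variance would reduce to the average within-imputation variance.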
About the journal:
The official journal of the Société de Néphrologie, the Société Francophone de Dialyse and the Société de Néphrologie Pédiatrique, Néphrologie et Thérapeutique publishes texts in French in the field of nephrology, whether updates of knowledge, clinical practice guidelines, original articles, or news about the life of the three founding societies. The variety of topics covered reflects the richness of nephrology, from fundamental aspects drawn from physiology, immunology, anatomic pathology or genetics to clinical nephrology topics, in particular those related to nephrological therapies: transplantation, hemodialysis and peritoneal dialysis.