Determining and visualising at-risk groups in case-control data
R. Marshall
Journal of Epidemiology and Biostatistics, 6(4), pp. 343-8. Published 2001-01-01.
DOI: 10.1080/13595220152601819 (https://doi.org/10.1080/13595220152601819)
Citations: 2
Abstract
BACKGROUND Case-control research is often exploratory, aiming to determine factors that increase risk. Regression methods are commonly used to determine combinations of risk factors that predispose to excess risk, and tree-based methods have recently been proposed as well. Both have limitations. An alternative approach is suggested, based on a search algorithm that identifies at-risk subgroups.
METHODS Statistical methods to determine and visualise at-risk subgroups in case-control studies are presented. The method for determining subgroups, search partition analysis (SPAN), searches among different Boolean combinations of risk factors. The subgroups identified are visualised by scaled rectangle diagrams, which show the size of the subgroups and the extent to which they overlap.
RESULTS Theory is presented for applying the method to case-control data. The methods are illustrated by analysis of three case-control studies: one on sudden infant death syndrome, a second on heart disease and a third on child pedestrian injuries.
CONCLUSIONS The methods provide a useful alternative to regression and tree-based analysis. They demarcate subgroups that, in the three examples, are easy to interpret and would not have been found by other methods. Scaled rectangle diagrams are a useful way to visualise the results.
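
To make the idea of searching among Boolean combinations of risk factors concrete, the following is a minimal sketch, not the published SPAN algorithm. It assumes binary risk factors, enumerates only "and"/"or" combinations of up to two factors, and ranks them by an odds ratio with a 0.5 continuity correction. The factor names `A`, `B`, `C`, the helper names, the scoring rule and the toy data are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def odds_ratio(rows, in_group):
    """Odds ratio of being a case inside vs outside a subgroup, plus subgroup size."""
    a = b = c = d = 0  # a/b: cases/controls inside; c/d: cases/controls outside
    for row in rows:
        inside = in_group(row)
        if row["case"]:
            a, c = (a + 1, c) if inside else (a, c + 1)
        else:
            b, d = (b + 1, d) if inside else (b, d + 1)
    # Assumption: add 0.5 to every cell so empty cells do not divide by zero.
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5)), a + b


def search_subgroups(rows, factors, max_terms=2, min_size=2):
    """Exhaustively score 'and'/'or' combinations of up to max_terms factors."""
    results = []
    for k in range(1, max_terms + 1):
        for combo in combinations(factors, k):
            # A single factor needs no connective; larger sets try both.
            ops = [("and", all)] if k == 1 else [("and", all), ("or", any)]
            for name, op in ops:
                rule = lambda row, combo=combo, op=op: op(row[f] for f in combo)
                ratio, size = odds_ratio(rows, rule)
                if size >= min_size:  # skip subgroups too small to interpret
                    results.append((ratio, f" {name} ".join(combo), size))
    return sorted(results, reverse=True)


def make(n, case, A, B, C):
    """Build n identical toy records (hypothetical data, not from the paper)."""
    return [{"case": case, "A": A, "B": B, "C": C} for _ in range(n)]


rows = (
    make(4, 1, 1, 1, 0) + make(1, 1, 1, 0, 0) + make(1, 1, 0, 1, 0)  # cases
    + make(1, 0, 1, 1, 0) + make(2, 0, 1, 0, 1) + make(2, 0, 0, 1, 0)  # controls
    + make(1, 0, 0, 0, 1) + make(2, 0, 0, 0, 0)
)

ranked = search_subgroups(rows, ["A", "B", "C"])
for ratio, label, size in ranked:
    print(f"{label:10s} OR={ratio:.2f} n={size}")
```

On this toy data the conjunction `A and B` ranks first. A real analysis would need a richer space of Boolean expressions and some control over complexity, which this sketch does not attempt.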