Completeness, agreement, and representativeness of ethnicity recording in the United Kingdom's Clinical Practice Research Datalink (CPRD) and linked Hospital Episode Statistics (HES).
Suhail I Shiekh, Mia Harley, Rebecca E Ghosh, Mark Ashworth, Puja Myles, Helen P Booth, Eleanor L Axson
Population Health Metrics 21 (1): 3. Published 2023-03-14. DOI: 10.1186/s12963-023-00302-0 (https://doi.org/10.1186/s12963-023-00302-0). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10013294/pdf/
Citation count: 4
Abstract
Background: This descriptive study assessed the completeness, agreement, and representativeness of ethnicity recording in the United Kingdom (UK) Clinical Practice Research Datalink (CPRD) primary care databases alone and, for those patients registered with a GP in England, when linked to secondary care data from Hospital Episode Statistics (HES).
Methods: Ethnicity records were assessed for all UK patients in the May 2021 builds of the CPRD GOLD and CPRD Aurum databases. In UK-wide analyses, data for England came from combined CPRD-HES, whereas data for Northern Ireland, Scotland, and Wales came from CPRD only. Agreement of ethnicity records per patient was assessed within each dataset (CPRD GOLD, CPRD Aurum, and HES) and between datasets at the highest-level ethnicity categorisation ('Asian', 'black', 'mixed', 'white', 'other'). Representativeness was assessed by comparing the ethnic distributions of CPRD-HES, at the highest-level categorisation, with those from the Census 2011 across the UK's devolved administrations. Additionally, CPRD-HES was compared with the experimental ethnic distributions for England and Wales from the Office for National Statistics in 2019 (ONS2019) and with the English ethnic distribution from May 2021 from NHS Digital's General Practice Extraction Service Data for Pandemic Planning and Research with HES data linkage (GDPPR-HES).
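As an illustration only, the completeness and between-dataset agreement measures described in the Methods could be sketched as follows. This is a minimal sketch, not the authors' code: the function names, the toy patient pairs, and the choice to key per-category agreement on the CPRD category are all assumptions made for the example.

```python
from collections import Counter

# Highest-level ethnicity categories named in the abstract.
CATEGORIES = ("Asian", "black", "mixed", "white", "other")


def completeness(values):
    """Fraction of patients with a recorded (non-None) ethnicity."""
    values = list(values)
    if not values:
        return None
    return sum(v is not None for v in values) / len(values)


def agreement_by_category(pairs):
    """Agreement between two sources among patients recorded in both.

    pairs: iterable of (cprd_category, hes_category); None means not recorded.
    Returns (overall_agreement, per_category_agreement), where the
    per-category denominator is the CPRD category (an assumption).
    """
    both = [(c, h) for c, h in pairs if c is not None and h is not None]
    if not both:
        return None, {}
    totals = Counter(c for c, _ in both)
    matched = Counter(c for c, h in both if c == h)
    overall = sum(matched.values()) / len(both)
    per_category = {
        cat: matched[cat] / totals[cat] for cat in CATEGORIES if totals[cat]
    }
    return overall, per_category


# Hypothetical toy data, not taken from the study.
toy_pairs = [
    ("white", "white"), ("white", "white"), ("Asian", "Asian"),
    ("mixed", "white"), ("other", None), (None, "black"),
    ("black", "black"), ("mixed", "mixed"),
]
cprd_completeness = completeness(c for c, _ in toy_pairs)
overall, per_category = agreement_by_category(toy_pairs)
```

Keying the per-category figures on one source is a design choice; a study could equally report agreement keyed on the HES category or use a symmetric measure.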
Results: In CPRD-HES, 81.7% of currently registered patients in the UK had ethnicity recorded in primary care. For patients with multiple ethnicity records, the proportion with mismatched ethnicity within individual primary and secondary care datasets was <10%. Of English patients with ethnicity recorded in both CPRD and HES, 93.3% of records matched at the highest-level categorisation; however, agreement was markedly lower in the 'mixed' and 'other' ethnic groups. CPRD-HES had a lower proportion of 'white' patients than the UK Census 2011 (80.3% vs. 87.2%) and the experimental ONS2019 data (80.4% vs. 84.3%). The CPRD-HES ethnic distribution was aligned with that of GDPPR-HES ('white' 80.4% vs. 80.7%), but with a smaller proportion classified as 'other' (1.1% vs. 2.8%).
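The representativeness comparisons reported above can be restated as percentage-point differences. The sketch below uses only the figures quoted in the Results; the dictionary layout and the helper name `pp_difference` are assumptions for illustration.

```python
# (category, reference source) -> (CPRD-HES %, reference %), as quoted in the Results.
comparisons = {
    ("white", "UK Census 2011"): (80.3, 87.2),
    ("white", "ONS2019"): (80.4, 84.3),
    ("white", "GDPPR-HES"): (80.4, 80.7),
    ("other", "GDPPR-HES"): (1.1, 2.8),
}


def pp_difference(cprd_hes, reference):
    """Percentage-point difference (CPRD-HES minus reference), to one decimal."""
    return round(cprd_hes - reference, 1)


differences = {key: pp_difference(*pair) for key, pair in comparisons.items()}

for (category, source), diff in differences.items():
    print(f"{category!r} vs. {source}: {diff:+.1f} percentage points")
```

A negative value means the category makes up a smaller share of CPRD-HES than of the reference source.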
Conclusions: CPRD-HES has suitable representation of all ethnic categories, with some overrepresentation of minority ethnic groups and a smaller proportion classified as 'other' compared with estimates of the UK general population from other data sources. CPRD-HES data is useful for studying health risks and outcomes in typically underrepresented groups.
About the journal:
Population Health Metrics aims to advance the science of population health assessment, and welcomes papers relating to concepts, methods, ethics, applications, and summary measures of population health. The journal provides a unique platform for population health researchers to share their findings with the global community. We seek research that addresses the communication of population health measures and policy implications to stakeholders; this includes papers related to burden estimation and risk assessment, and research addressing population health across the full range of development. Population Health Metrics covers a broad range of topics encompassing health state measurement and valuation, summary measures of population health, descriptive epidemiology at the population level, burden of disease and injury analysis, disease and risk factor modeling for populations, and comparative assessment of risks to health at the population level. The journal is also interested in how to use and communicate indicators of population health to reduce disease burden, and the approaches for translating from indicators of population health to health-advancing actions. As a cross-cutting topic of importance, we are particularly interested in inequalities in population health and their measurement.